Proteins involved in the regulation of energy homeostasis

Description 

This invention relates to the use of nucleic acid sequences encoding casein 
kinase 1 gamma (CSNK1G), GABA(A) receptor-associated protein 
(GABARAP), proliferation-associated 2G4 protein, 38kDa (PA2G4, also 
referred to as methionyl aminopeptidase homologous protein), molybdenum 
cofactor synthesis-step 1 protein (MOCS1), cell division cycle 10 protein 
homolog (CDC10, also referred to as septin and septin 7), pyruvate kinase 
(PK), calreticulin (CALR), or homologous proteins, and the polypeptides 
encoded thereby and to the use of these sequences or effectors thereof in 
the diagnosis, study, prevention, and treatment of diseases and disorders 
related to body-weight regulation, for example, but not limited to, 
metabolic diseases such as obesity as well as related disorders such as 
eating disorder, cachexia, diabetes mellitus, hypertension, coronary heart 
disease, hypercholesterolemia, dyslipidemia, osteoarthritis, and gallstones. 

Obesity is one of the most prevalent metabolic disorders in the world. It is 
a still poorly understood human disease that becomes more and more 
relevant for western society. Obesity is defined as an excess of body fat, 
frequently resulting in a significant impairment of health. Cardiovascular 
risk factors like hypertension, high blood levels of triglycerides and fasting 
glucose as well as low blood levels of HDL cholesterol are often linked to 
obesity. This typical cluster of symptoms is commonly defined as 
"metabolic syndrome" (Reaven, 2002, Circulation 106(3): 286-8 
reviewed). The metabolic syndrome often precedes the development of 
type II diabetes and cardiovascular disease (McCook, 2002, JAMA
288:2709-2716). Besides severe risks of illness such as diabetes, 
hypertension and heart disease, individuals suffering from obesity are often 
isolated socially. Human obesity is strongly influenced by environmental 
and genetic factors, whereby the environmental influence is often a hurdle
for the identification of (human) obesity genes. Obesity is influenced by 
genetic, metabolic, biochemical, psychological, and behavioral factors. As 
such, it is a complex disorder that must be addressed on several fronts to 
achieve lasting positive clinical outcome. Obese individuals are prone to 
ailments including: diabetes mellitus, hypertension, coronary heart disease, 
hypercholesterolemia, dyslipidemia, osteoarthritis, and gallstones. 

Obesity is not to be considered as a single disorder but a heterogeneous 
group of conditions with (potential) multiple causes. Obesity is also 
characterized by elevated fasting plasma insulin and an exaggerated insulin 
response to oral glucose intake (Koltermann, J. Clin. Invest 65, 1980,
1272-1284) and a clear involvement of obesity in type 2 diabetes mellitus
can be confirmed (Kopelman, Nature 404, 2000, 635-643). 

Even if several candidate genes have been described which are supposed 
to influence the homeostatic system(s) that regulate body mass/weight, 
like leptin, VCPI, VCPL, or the peroxisome proliferator-activated
receptor-gamma co-activator, the distinct molecular mechanisms and/or 
molecules influencing obesity or body weight/body mass regulations are 
not known. 

Therefore, the technical problem underlying the present invention was to 
provide for means and methods for modulating (pathological) metabolic 
conditions influencing body-weight regulation and/or energy homeostatic 
circuits. The solution to said technical problem is achieved by providing the 
embodiments characterized in the claims. 

Accordingly, the present invention relates to genes with novel functions in 
body-weight regulation, energy homeostasis, metabolism, and obesity. The 
present invention discloses specific genes involved in the regulation of
body-weight, energy homeostasis, metabolism, and obesity, and thus in
disorders related thereto such as eating disorder, cachexia, diabetes
mellitus, hypertension, coronary heart disease, hypercholesterolemia, 
dyslipidemia, osteoarthritis, and gallstones. The present invention describes 
the human genes encoding casein kinase 1 gamma, GABARAP, PA2G4, 
MOCS1, CDC10, PK, calreticulin, or homologous proteins as being 
involved in those conditions mentioned above. 

The term 'GenBank Accession number' relates to NCBI GenBank database
entries (Benson et al., Nucleic Acids Res. 28, 2000, 15-18).

The casein kinase I (CKI) family of protein kinases is a group of highly 
related, ubiquitously expressed serine/threonine kinases found in all 
eukaryotic organisms from protozoa to man (Vielhaber and Virshup, 2001,
IUBMB Life 51(2):73-78). Recent advances in diverse fields, including
developmental biology and chronobiology, have elucidated roles for CKI in 
regulating critical processes such as Wnt signaling, circadian rhythm, 
nuclear import, and Alzheimer's disease progression. Casein kinase I is a 
serine/threonine-specific protein kinase that constitutes most of the kinase 
activity in eukaryotic cells, where it is mainly localized in the nucleus, 
cytoplasm, and several membranes. The monomeric enzyme 
phosphorylates hierarchically a variety of substrates without the 
involvement of the second messenger in signal transduction. 

Casein kinase I, one of the first protein kinases identified biochemically, is 
known to exist in multiple isoforms in mammals. Three separate members 
of the CKI gamma subfamily were identified in testis: the isoforms CKI 
gamma 1, CKI gamma 2, and CKI gamma 3. The proteins are more than 
90% identical to each other within the protein kinase domain but only 
51-59% identical to other casein kinase I isoforms within this region. 
Messenger RNA for CKI gamma 3 was observed in testis, brain, heart,
kidney, lung, liver, and muscle whereas CKI gamma 1 and CKI gamma 2
messages were restricted to testis (Zhai et al., 1995, J Biol Chem
270(21):12717-12724). As shown in this invention, Taqman analysis
revealed ubiquitous expression of CKI gamma 1 and CKI gamma 2, with 
strongest expression in testis. The enzymes phosphorylate typical in vitro 
casein kinase I substrates such as casein, phosvitin, and a synthetic 
peptide, D4. The known casein kinase I inhibitor CKI-7 also inhibits the CKI
gamma isoforms, although less effectively than the CKI alpha or CKI delta
isoforms. All three CKI gamma isoforms undergo autophosphorylation when
incubated with ATP and Mg2+. The YCK1 and YCK2 genes in
Saccharomyces cerevisiae encode casein kinase I homologs, defects in 
which lead to aberrant morphology and growth arrest (Zhai et al., supra). 

The GABA(A)-receptor-associated protein (GABARAP) is a small 17 kDa
microtubule-associated protein that recognizes and binds the gamma
subunit of type A gamma-aminobutyric acid (GABA(A)) receptors,
which plays a central role in synaptic targeting. GABARAP
has also been reported to bind N-ethylmaleimide sensitive factor (NSF), a 
protein critical for intracellular trafficking events. GABARAP is specifically 
localized to intracellular membranes, including the Golgi network. The 
crystal structure of human GABARAP comprises an N-terminal helical 
subdomain and a ubiquitin-like C-terminal domain (Coyle et al., 2002, 
Neuron 33(1):63-74). Structure-based mutational analysis demonstrates
that the N-terminal subdomain is responsible for tubulin binding while the 
C-terminal domain contains the binding site for the GABA(A) receptor. Coyle et al.
(supra) show GABARAP can switch from a monomer to an extended linear 
polymer form that may function to assemble microtubules during the 
intracellular trafficking or postsynaptic clustering of GABA(A) receptors. 
Using the yeast two-hybrid screen, GABARAP has been identified as 
interactor of ULK1 (Unc-51-like kinase), suggesting an involvement in 
vesicle transport and axonal elongation in mammalian neurons (Okazaki et 
al., 2000, Brain Res. Mol. Brain Res. 86:1-12). No function in the 
regulation of metabolism has been reported for GABARAP or its human 
homolog. 

The dinuclear metalloenzymes methionine aminopeptidases (MAPs) are
proteases with important roles in protein processing, especially in
proteolysis and peptidolysis (Datta B., 2000, Biochimie. 82(2):95-107).
MAPs are involved in the removal of the N-terminal methionine from 
proteins and peptides (Lowther & Matthews, 2000, Biochim Biophys Acta 
1477(1-2):157-167). 

Highly homologous MAPs have been identified from various prokaryotic 
and eukaryotic organisms, for example E. coli, S. typhimurium, P. furiosus, 
Saccharomyces cerevisiae, Drosophila melanogaster, porcine, mouse, rat, 
and human. The Drosophila melanogaster gene CG10576 encodes a 
metallopeptidase family M24 methionyl aminopeptidase (EC:3.4.11.18). A
cell cycle-specifically modulated nuclear protein of 38 kDa (termed 
P38-2G4; PA2G4; ErbB-3 binding protein Ebp1) has been described to be 
ubiquitously expressed in mouse and human (Radomski & Jost, 1995, Exp 
Cell Res 220(2):434-445; Lamartine et al., 1997, Cytogenet Cell Genet
78(1):31-35). Substantial progress has recently been made in determining 
the structures of several members of this family. 

The identification of human MAPs as the target of putative anti-cancer 
drugs reiterates the importance of this family of enzymes. For example, the
ErbB-3 binding protein (Ebp1; identical to PA2G4) interacts with
the juxtamembrane domain of ErbB-3, the human epidermal growth
factor receptor-3 (a class I tyrosine kinase receptor) involved in signal
transduction pathways that regulate cell growth and differentiation. ErbB-3
has low tyrosine kinase activity, suggesting that it may function more as 
an adaptor in signaling than as a kinase. The binding of Ebp1 to ErbB-3 
inhibits the proliferation and induces the differentiation of human breast 
cancer cells. The mechanisms of these effects are unknown (see, for 
example, Lessor et al., 2000, J Cell Physiol 183(3):321-329; Yoo et al.,
2000, Br J Cancer 82(3):683-690; Xia et al., 2001, J Cell Physiol 
187(2):209-417). 

The Drosophila gene cassette Mocs1 encodes Molybdenum cofactor
synthesis-step 1 proteins A, A-B, and B (Mocs1 A, Mocs1 A-B, and
Mocs1 B) which are involved in Molybdopterin cofactor biosynthesis. As
shown in this invention, Mocs1 is most homologous to the isoforms of
human molybdenum cofactor biosynthesis protein 1. Molybdenum is an
essential cofactor in many enzymes, but must first be complexed by
molybdopterin, whose synthesis requires four enzymatic activities (see, for
example, Gray & Nicholls, 2000, RNA 6(7):928-36). The first two enzymes
of this pathway are encoded by the MOCS1 locus in humans. A well-conserved
novel mRNA splicing phenomenon produces both an apparently
bicistronic transcript, as well as a distinct class of monocistronic 
transcripts. The latter are created by a variety of splicing mechanisms 
resulting in fusion of the MOCS1A and MOCS1B open reading frames. 
Therefore, a single bifunctional protein is encoded embodying both 
MOCS1A and MOCS1B activities. This coexpression profile was observed 
in vertebrates (including human, mouse, cow, rabbit, opossum, and 
chicken) and invertebrates (e.g. fruit fly and nematode) spanning at least 
700 million years of evolution. 

It has been described that Molybdate (Mo) exerts insulinomimetic effects in 
vitro. Reul et al. (1997, J Endocrinol 155(1):55-64) showed that Mo can
improve glucose homeostasis in genetically obese, insulin-resistant ob/ob 
mice. Oral administration of Mo for 7 weeks did not affect body weight, 
but decreased the hyperglycaemia of obese mice to the levels of lean (L) 
(+/+) mice, and reduced the hyperinsulinaemia to one-sixth of
pretreatment levels. 

Human MoCo deficiency is a fatal disease resulting in severe neurological 
damage and death in early childhood. Molybdenum cofactor (MoCo) 
deficiency leads to a combined deficiency of the molybdoenzymes. 
Effective therapy is not available for this rare disease. Most patients harbor 
MOCS1 mutations, which prohibit formation of a precursor, or carry 
MOCS2 mutations, which abrogate precursor conversion to molybdopterin.
A gephyrin gene (GEPH) deletion was identified in a patient with symptoms 
typical of MoCo deficiency (Reiss et al., 2001, Am J Hum Genet 
68(1):208-13). Gephyrin was originally identified as a
membrane-associated protein that is essential for the postsynaptic 
localization of receptors for the neurotransmitters glycine and GABA(A). 

Septins are novel GTPase proteins that are broadly distributed in many 
eukaryotes except plants. The septins are an evolutionary conserved family 
of proteins that are involved in cytokinesis (the final event of the cell 
division cycle) and other aspects of cell-surface organization (reviewed in 
Cooper & Kiehart, 1996; Field & Kellog, 1999). Members of the septin 
family contain sequences characteristic of the GTPase superfamily of 
proteins. 

For example, in Saccharomyces cerevisiae, the Cdc3, Cdc10, Cdc11,
Cdc12 and Shs1/Sep7 septins assemble as a ring that marks the 
cytokinetic plane throughout the budding cycle (see, for example, Sidd et 
al., 2001, Microbiology 147(Pt 6):1437-50). This structure participates in
different aspects of morphogenesis, such as selection of cell polarity, 
localization of chitin synthesis, the switch from hyperpolar to isotropic bud 
growth after bud emergence and the spatial regulation of septation. The 
septin cytoskeleton assembles at the pre-bud site before bud emergence, 
remains there during bud growth and duplicates at late mitosis eventually 
disappearing after cell separation. The high degree of conservation, 
ubiquitous expression and proven role in cytokinesis suggests septins are 
certain to be important players in regulating cell architecture and function 
(see, for example, Field et al., 1996, J Cell Biol 133(3):605-616).

For example, the Drosophila gene peanut (pnut) encodes a septin homolog,
a microtubule-binding protein involved in cytokinesis (see, for example,
Neufeld and Rubin, 1994, Cell 77(3):371-379). Pnut protein is localized to
the cleavage furrow of dividing cells during cytokinesis and to the
intercellular bridge connecting postmitotic daughter cells. In addition to its 
role in cytokinesis, pnut displays genetic interactions with seven in 
absentia (sina), a gene required for neuronal fate determination in the 
compound eye and involved in ubiquitin-dependent protein degradation. 
The amino acid sequence of the Drosophila gene pnut is highly homologous 
to that of Saccharomyces cerevisiae CDC3, CDC10, CDC11, CDC12, 
Candida albicans (CaCDC10), the Drosophila genes Sep1, Sep2, and the
mammalian genes BH5, cell division cycle 10 (CDC10), septin Nedd5,
Diff6, septin 2 (Sep2), and septin 3 (Sep3), which are implicated in 
cytokinesis and cell polarity (Xiong et al., 1999, Mech Dev 
86(1-2):183-191). 

Enzymes of the glycolytic pathway convert the sugar glucose to pyruvate 
while simultaneously producing ATP. The pathway also provides building 
blocks for the synthesis of cellular components such as long-chain fatty 
acids. After glycolysis, pyruvate is converted to acetyl-Coenzyme A, which 
enters the citric acid cycle. Glycolytic enzymes include hexokinase, 
phosphoglucose isomerase, phosphofructokinase, aldolase, triose
phosphate isomerase, glyceraldehyde 3-phosphate dehydrogenase,
phosphoglycerate kinase, phosphoglyceromutase, enolase, and pyruvate 
kinase. Of these, phosphofructokinase, hexokinase, and pyruvate kinase 
are important in regulating the rate of glycolysis. 

Carbohydrates mediate their conversion to triglycerides in the liver by 
promoting both rapid posttranslational activation of rate-limiting glycolytic 
and lipogenic enzymes and transcriptional induction of the genes encoding 
many of these same enzymes. A transcription factor has been described 
that recognizes the carbohydrate response element (ChRE) within the 
promoter of the L-type pyruvate kinase (LPK) gene. The DNA-binding 
activity of this ChRE-binding protein in rat livers is specifically induced by 
a high carbohydrate diet. It was suggested that the ChRE-binding protein 
may contribute to the imbalance between nutrient utilization and storage
characteristic of obesity (Yamashita et al., 2001, Proc Natl Acad Sci U S A
98(16):9116-21).

The obese (fa/fa) Zucker rat shows altered thermogenesis and changes in both
lipid and carbohydrate metabolism (see, for example, Sanchez-Gutierrez,
2000, Arch Biochem Biophys 373(1):249-54; Perez et al., 1998, Int J
Obes Relat Metab Disord 22(7):667-72). The activities of glucokinase and 
L-pyruvate kinase increased in fed obese (fa/fa) rats compared with fed 
lean (fa/-) animals, but decreased during starvation. The mRNA levels of 
glycolytic enzymes such as glucokinase and L-pyruvate kinase in fed obese 
rats were higher than in fed lean animals. During starvation, they 
decreased in lean and obese rats. The stimulation of gluconeogenesis by 
epinephrine was accompanied by an inactivation of both pyruvate kinase 
and 6-phosphofructo 2-kinase in rat hepatocytes. 

Pyruvate kinase is a key enzyme in glycogen metabolism. Mammalian 
pyruvate kinases of different tissues are distinct, their characteristics being 
related to tissue metabolic requirements (for example see Bigley et al., 
1974, Enzyme 17(5):297-306). Pyruvate kinase is also known as
ATP:pyruvate phosphotransferase (EC 2.7.1.40). At least 3 molecular
forms with pyruvate kinase activity are known (Bigley et al., 1968, Enzym.
Biol. Clin. 9:10-20). The form that is deficient in a type of hemolytic
anemia is the red cell variety, PK1. PK2 is found in kidney. PK3 is found in
leukocytes, muscle, platelets, and brain but not in red cells or kidney. PK3 
is a tetrameric protein and all subunits are alike. The enzyme is insensitive 
to fructose-1,6-diphosphate. Tsutsumi et al. (1988, Genomics 2(1):86-9) 
showed that pyruvate kinase occurs in 4 isozymic forms (L, R, M1, M2). 

The Drosophila gene CG9429 (Crc, calreticulin) encodes a putative
calcium-binding protein (chaperone) which is a component of the
endoplasmic reticulum in Drosophila. InterPro analysis of this gene reveals
an endoplasmic reticulum targeting sequence, calreticulin family domains,
and aspartic acid-rich regions. As shown in this invention, the Drosophila 
Crc is most homologous to human calreticulin (other names: CRP55, 
calregulin, HACBP, ERP60, 52 kDa Ribonucleoprotein autoantigen 
RO/SS-A, sicca syndrome antigen A or autoantigen Ro; for example, 
GenBank Accession Number NM_004343 and XM_032030 (identical
proteins)) and to mouse calreticulin (GenBank Accession Numbers 
AAH03453.1, AAH03453, and BC003453). Calreticulin is a highly 
conserved, multifunctional protein that acts as a major calcium-binding 
protein, most abundant in the lumen of the endoplasmic and sarcoplasmic 
reticulum. The protein has well-recognized physiological roles in the ER as 
a molecular chaperone and Ca(2+)-signalling molecule. Calreticulin has
also been found in other membrane-bound organelles, at the cell surface
and in the extracellular environment, where it has recently been shown to
exert a number of physiological and pathological effects (see, for example,
the review by Johnson et al., 2001, Trends Cell Biol 11(3):122-9). In addition to
the calcium-binding functions and molecular chaperone function, 
calreticulin has been characterized as an extracellular lectin, an intracellular 
mediator of integrin function, an inhibitor of steroid hormone-regulated 
gene expression and a C1q-binding protein (see, for example, review by 
Coppolino et al., 1998, Int J Biochem Cell Biol 30(5):553-8). Calreticulin 
binds to antibodies in certain sera of systemic lupus and Sjogren patients
which contain anti-Ro/SSA antibodies. Increased autoantibody titers of both
the IgG and IgM classes against human calreticulin are found in infants
with complete congenital heart block.

So far, it has not been described that casein kinase 1 gamma, GABARAP, 
PA2G4, MOCS1, CDC10, PK, calreticulin, or homologous proteins are 
involved in the regulation of energy homeostasis and body-weight 
regulation and related disorders, and thus, no functions in metabolic 
diseases and other diseases as listed above have been discussed. In this 
invention we demonstrate that the correct gene dose of casein kinase 1 



WO 03/066086 PCT/EP03/01094 

gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, calreticulin, or 
homologous proteins is essential for maintenance of energy homeostasis. 
A genetic screen was used to identify that mutation of genes encoding
casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, or
calreticulin homologous proteins causes obesity, reflected by a significant
increase of triglyceride content, the major energy storage substance.

The function of calreticulin and casein kinase 1 gamma in metabolic 
disorders is further validated by data obtained from additional screens. For 
example, an additional screen using Drosophila mutants with modifications 
of the eye phenotype identified a modification of UCP activity by 
calreticulin, thereby leading to an altered mitochondrial activity. An 
additional screen using Drosophila mutants with modifications of the eye 
phenotype identified an interaction of casein kinase 1 gamma with adipose, 
a protein regulating, causing or contributing to obesity. These findings 
suggest the presence of similar activities of these described homologous
proteins in humans, which provides insight into diagnosis, treatment, and
prognosis of metabolic disorders.

Polynucleotides encoding proteins with homologies to casein kinase 1 
gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, or calreticulin are 
suitable to investigate diseases and disorders as described above. Further 
new compositions useful in diagnosis, treatment, and prognosis of diseases 
and disorders as described above are provided. 

Before the present proteins, nucleotide sequences, and methods are 
described, it is understood that this invention is not limited to the particular 
methodology, protocols, cell lines, vectors, and reagents described as 
these may vary. It is also to be understood that the terminology used 
herein is for the purpose of describing particular embodiments only, and is 
not intended to limit the scope of the present invention that will be limited 
only by the appended claims. Unless defined otherwise, all technical and 
scientific terms used herein have the same meanings as commonly
understood by one of ordinary skill in the art to which this invention 
belongs. Although any methods and materials similar or equivalent to those 
described herein can be used in the practice or testing of the present 
invention, the preferred methods, devices, and materials are now 
described. All publications mentioned herein are incorporated herein by 
reference for the purpose of describing and disclosing the cell lines, 
vectors, and methodologies that are reported in the publications which 
might be used in connection with the invention. Nothing herein is to be 
construed as an admission that the invention is not entitled to antedate 
such disclosure. 

The present invention discloses that casein kinase 1 gamma, GABARAP,
PA2G4, MOCS1, CDC10, PK, calreticulin, or homologous proteins
regulate energy homeostasis and fat metabolism, especially the
metabolism and storage of triglycerides, and discloses polynucleotides
which identify and encode these proteins. The invention
also relates to vectors, host cells, antibodies, and recombinant methods for 
producing the polypeptides and polynucleotides of the invention. The 
invention also relates to the use of these sequences in the diagnosis, 
study, prevention, and treatment of diseases and disorders, for example, 
but not limited to, metabolic diseases such as obesity as well as related 
disorders such as eating disorder, cachexia, diabetes mellitus, 
hypertension, coronary heart disease, hypercholesterolemia, dyslipidemia, 
osteoarthritis, and gallstones. 

Casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, and 
calreticulin homologous proteins and nucleic acid molecules coding
therefor are obtainable from insect or vertebrate species, e.g. mammals or
birds. Particularly preferred are homologous nucleic acids, particularly 
nucleic acids encoding a human casein kinase 1, gamma 1, human casein 
kinase 1, gamma 2, human casein kinase 1, gamma 3, human GABARAP, 
human GABARAP like 1, human GABARAP like 2, human GABARAP like 3,
human proliferation-associated 2G4 protein, the human MOCS1 isoforms, 
human CDC10, human pyruvate kinase, muscle, human pyruvate kinase, 
liver and RBC, human calreticulin, and/or human calreticulin 2. 

The invention particularly relates to a nucleic acid molecule encoding a 
polypeptide contributing to regulating the energy homeostasis and the 
metabolism of triglycerides, wherein said nucleic acid molecule comprises 
(a) the nucleotide sequence of (i) Drosophila gilgamesh (gish), human 
casein kinase 1, gamma 1 (SEQ ID NO: 1), human casein kinase 1, 
gamma 2 (SEQ ID NO: 3), human casein kinase 1, gamma 3 (SEQ ID 
NO: 5), (ii) Drosophila Gadfly Accession Number CG1534 (synonym 
for Gadfly Accession Number CG32672 and Genbank Accession 
Number NM_167245), human GABARAP (SEQ ID NO: 7), human
GABARAP like 1 (SEQ ID NO: 9), human GABARAP like 2 (SEQ ID 
NO: 11), human GABARAP like 3 (SEQ ID NO: 13), (iii) Drosophila 
Gadfly Accession Number CG10576, human PA2G4 (SEQ ID NO: 
15), (iv) Drosophila Mocs1, human MOCSA (SEQ ID NO: 17),
human MOCS1 isoform 1 (SEQ ID NO: 19), human MOCS1 isoform 
2 (SEQ ID NO: 21), human MOCS1 isoform 3 (SEQ ID NO: 23), (v) 
Drosophila peanut (pnut), human CDC10 (SEQ ID NO: 25), (vi) 
Drosophila Gadfly Accession Number CG7069, human pyruvate 
kinase, muscle (SEQ ID NO: 27), human pyruvate kinase, liver and 
RBC (SEQ ID NO: 30), (vii) Drosophila calreticulin (Crc), human
calreticulin (SEQ ID NO: 32), human calreticulin 2 (SEQ ID NO:34), 
and/or a sequence complementary thereto, 

(b) a nucleotide sequence which hybridizes at 50°C in a solution 
containing 1 x SSC and 0.1% SDS to a sequence of (a),

(c) a sequence corresponding to the sequences of (a) or (b) within the 
degeneration of the genetic code, 

(d) a sequence which encodes a polypeptide which is at least 85%, 
preferably at least 90%, more preferably at least 95%, more 
preferably at least 98% and up to 99.6% identical to the amino acid
sequences of human casein kinase 1, gamma 1 (SEQ ID NO: 2),
human casein kinase 1, gamma 2 (SEQ ID NO: 4), human casein
kinase 1, gamma 3 (SEQ ID NO: 6), human GABARAP (SEQ ID NO: 
8), human GABARAP like 1 (SEQ ID NO: 10), human GABARAP like 
2 (SEQ ID NO: 12), human GABARAP like 3 (SEQ ID NO: 14), 
human PA2G4 (SEQ ID NO: 16), human MOCSA (SEQ ID NO: 18), 
human MOCS1 isoform 1 (SEQ ID NO: 20), human MOCS1 isoform 
2 (SEQ ID NO: 22), human MOCS1 isoform 3 (SEQ ID NO: 24), 
human CDC10 (SEQ ID NO: 26), human pyruvate kinase, muscle, 
isozyme M1 (SEQ ID NO: 28), human pyruvate kinase, muscle, 
isozyme M2 (SEQ ID NO: 29), human pyruvate kinase, liver and RBC 
(SEQ ID NO: 31), human calreticulin (SEQ ID NO: 33), and/or human
calreticulin 2 (SEQ ID NO:35), 

(e) a sequence which differs from the nucleic acid molecule of (a) to (d) 
by mutation and wherein said mutation causes an alteration, 
deletion, duplication and/or premature stop in the encoded 
polypeptide or 

(f) a partial sequence of any of the nucleotide sequences of (a) to (e) 
having a length of at least 15 bases, preferably at least 20 bases,
more preferably at least 25 bases and most preferably at least 50 
bases. 
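
The identity thresholds of item (d) and the minimum fragment lengths of item (f) can be illustrated with a short Python sketch. The text above does not state how percent identity is to be computed, so the definition used here (identical residues divided by aligned, gap-free positions of a pre-computed pairwise alignment) is an assumption, and the function names are hypothetical helpers introduced only for illustration.

```python
# Illustrative sketch only. Assumption: the two sequences have already been
# aligned (gaps written as "-") and identity is counted over gap-free columns.

IDENTITY_TIERS = (98.0, 95.0, 90.0, 85.0)  # thresholds named in item (d)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


def identity_tier(pct: float):
    """Return the highest threshold of item (d) that pct reaches, or None."""
    for tier in IDENTITY_TIERS:
        if pct >= tier:
            return tier
    return None


def meets_fragment_length(sequence: str, minimum: int = 15) -> bool:
    """Check the minimum partial-sequence length of item (f) (15 bases by default)."""
    return len(sequence) >= minimum
```

In practice the alignment itself would come from a dedicated alignment tool; the sketch only covers the counting step that follows.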

The invention is based on the discovery that casein kinase 1 gamma 
(CSNK1G), GABA(A) receptor-associated protein (GABARAP),
proliferation-associated 2G4 protein, 38kDa (PA2G4, also referred to as 
methionyl aminopeptidase homologous protein), molybdenum cofactor 
synthesis-step 1 protein (MOCS1), cell division cycle 10 protein homolog 
(CDC10, also referred to as septin 7), pyruvate kinase (PK), calreticulin 
(CALR) or homologous proteins (herein referred to as casein kinase 1 
gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, or calreticulin) and the 
polynucleotides encoding these, are involved in the regulation of 
triglyceride storage and therefore energy homeostasis. The invention
describes the use of these polypeptides or fragments thereof, 
polynucleotides or fragments thereof and effectors (receptors) of these 
molecules, e.g. antibodies, biologically active nucleic acids, such as 
antisense molecules, RNAi molecules or ribozymes, aptamers, peptides or 
low-molecular weight organic compounds recognizing said polynucleotides 
or polypeptides for the diagnosis, study, prevention, or treatment of 
diseases and disorders related thereto, including metabolic diseases such 
as obesity as well as related disorders such as eating disorder, cachexia, 
diabetes mellitus, hypertension, coronary heart disease, 
hypercholesterolemia, dyslipidemia, osteoarthritis, and gallstones. 

Accordingly, the present invention relates to genes with novel functions in 
body-weight regulation, energy homeostasis, metabolism, and obesity. To 
find genes with novel functions in energy homeostasis, metabolism, and 
obesity, a functional genetic screen was performed with the model 
organism Drosophila melanogaster (Meigen). The ability to manipulate and 
screen the genomes of model organisms such as the fly Drosophila 
melanogaster provides a powerful tool to analyze biological and 
biochemical processes that have direct relevance to more complex 
vertebrate organisms due to significant evolutionary conservation of genes, 
cellular processes, and pathways (see, for example, Adams M. D. et al., 
(2000) Science 287: 2185-2195). Identification of novel gene functions in 
model organisms can directly contribute to the elucidation of correlative 
pathways in mammals (humans) and of the methods of modulating them. 
A correlation between a pathology model (such as changes in triglyceride 
levels as indication for metabolic syndrome including obesity) and the 
modified expression of a fly gene can identify the association of the human 
ortholog with the particular human disease. One resource for screening 
was a proprietary Drosophila melanogaster stock collection of EP-lines.
Additionally, the publicly available EP-collection was screened. The 
P-vector of both collections has Gal4-UAS-binding sites fused to a basal 
promoter that can transcribe adjacent genomic Drosophila sequences upon
binding of Gal4 to UAS-sites. This enables the EP-line collection to be used for
overexpression of endogenous flanking gene sequences. In addition, 
without activation of the UAS-sites, integration of the EP-element into the 
gene is likely to cause a reduction of gene activity, and allows determining 
its function by evaluating the loss-of-function phenotype. 

Triglycerides are the most efficient storage for energy in cells. Obese 
people mainly show a significant increase in the content of triglycerides. In 
order to isolate genes with a function in energy homeostasis, several 
thousand proprietary and publicly available EP-lines were tested for their 
triglyceride content after a prolonged feeding period (see Examples for 
more detail). Lines with significantly changed triglyceride content were 
selected as positive candidates for further analysis. 

In this invention, the content of triglycerides of a pool of flies with the 
same genotype was analyzed after feeding for six days using a triglyceride 
assay, as, for example, but not for limiting the scope of the invention, is 
described below in the examples section. Male flies homozygous for the 
integration of vectors for Drosophila lines HD-EP(3)37409, EP(3)3271, 
EP(3)3688, EP(2)2036, EP(3)3224, EP(3)3321, EP(3)0834, and 
EP(3)0979, and hemizygous for the integration of vectors for Drosophila 
line PX6298.1, were analyzed in assays measuring the triglyceride 
contents of these flies, illustrated in more detail in the EXAMPLES section. 
The results of the triglyceride content analysis are shown in FIGURES 1, 6, 
11, 16, 21, 26, and 30. 

Adipose (adp) is a protein that has been described as regulating, causing or 
contributing to obesity in an animal or human (see WO 01/96371). 
Transgenic flies over-expressing the adipose gene in the developing 
Drosophila eye were generated and analyzed for modifications of the eye 
phenotype (for example, an enhancement or a suppression of the 
phenotype). Mutations changing the eye phenotype affect genes that 
modify the activity of adipose. Fly line HD-EP(3)37409 was found to be an 
enhancer of the described eye phenotype. This result strongly suggests 
an interaction of the gilgamesh gene with adipose, since the integration of 
HD-EP(3)37409 was found to be located at the gilgamesh locus. This 
supports a function of gilgamesh and homologous proteins in the 
regulation of energy homeostasis. 

An additional screen using Drosophila mutants with modifications of the 
eye phenotype identified a modification of UCP activity by calreticulin, 
thereby leading to an altered mitochondrial activity. 

Genomic DNA sequences localized to the integration site of the EP vector 
(herein HD-EP(3)37409, PX6298.1, EP(3)3271, EP(3)3688, EP(2)2036 / 
EP(3)3224 / EP(3)3321, EP(3)0834, and EP(3)0979) were isolated. Using 
those isolated genomic sequences, public databases like the Berkeley 
Drosophila Genome Project (GadFly; see also FlyBase (1999) Nucleic Acids 
Research 27:85-88) were screened, thereby identifying the integration site 
of the vectors, and the corresponding gene, described in more detail in the 
EXAMPLES section. The molecular organization of the gene is shown in 
FIGURES 2, 7, 12, 17, 22, 27, and 31. 

The Drosophila genes and proteins encoded thereby with functions in the 
regulation of triglyceride metabolism were further analysed using the 
BLAST algorithm searching in publicly available sequence databases and 
mammalian homologs were identified (see FIGURES 3, 4, 8, 9, 13, 14, 18, 
19, 23, 24, 28, 29, 32, and 33). 

The function of the mammalian homologs in energy homeostasis was 
further validated in this invention by analyzing the expression of the 
transcripts in different tissues and by analyzing the role in adipocyte 
differentiation. Expression profiling studies (see Examples for more detail) 
confirm the particular relevance of the protein(s) of the invention as 
regulators of energy metabolism in mammals. Further, we show that the 
proteins of the invention are regulated by fasting and by genetically 
induced obesity. In this invention, we used mouse models of insulin 
resistance and/or diabetes, such as mice carrying gene knockouts in the 
leptin pathway (for example, ob (leptin) or db (leptin receptor) mice) to 
study the expression of the protein of the invention. Such mice develop 
typical symptoms of diabetes, show hepatic lipid accumulation and 
frequently have increased plasma lipid levels (see Bruning et al., 1998, Mol. 
Cell 2:449-569). 

The invention also encompasses polynucleotides that encode the proteins 
of the invention and homologous proteins. Accordingly, any nucleic acid 
sequence, which encodes the amino acid sequences of the proteins of the 
invention and homologous proteins, can be used to generate recombinant 
molecules that express the proteins of the invention and homologous 
proteins. In a particular embodiment, the invention encompasses a nucleic 
acid encoding (i) Drosophila gilgamesh (gish), human casein kinase 1, 
gamma 1, human casein kinase 1, gamma 2, human casein kinase 1, 
gamma 3, (ii) Drosophila Gadfly Accession Number CG1534 (synonym for 
Gadfly Accession Number CG32672 and GenBank Accession Number 
NM_167245), human GABARAP, human GABARAP like 1, human 
GABARAP like, human GABARAP like 3, (iii) Drosophila Gadfly Accession 
Number CG10576, human PA2G4, (iv) Drosophila Mocs1, human MOCSA, 
human MOCS1 isoform 1, human MOCS1 isoform 2, human MOCS1 
isoform 3, (v) Drosophila peanut (pnut), human CDC10, (vi) Drosophila 
Gadfly Accession Number CG7069, human pyruvate kinase, muscle, 
human pyruvate kinase, liver and RBC, (vii) Drosophila calreticulin (Crc), 
human calreticulin, or human calreticulin 2; referred to herein as the 
proteins of the invention. It will be appreciated by those skilled in the art 
that as a result of the degeneracy of the genetic code, a multitude of 
nucleotide sequences encoding casein kinase 1 gamma, GABARAP, 
PA2G4, MOCS1, CDC10, PK, calreticulin, or homologous proteins, some 
bearing minimal homology to the nucleotide sequences of any known and 
naturally occurring gene, may be produced. Thus, the invention 
contemplates each and every possible variation of nucleotide sequence that 
could be made by selecting combinations based on possible codon choices. 
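
As a rough illustration of the scale of this degeneracy, the following minimal Python sketch counts how many distinct nucleotide sequences can encode a given peptide. It assumes the standard genetic code (61 sense codons) and ignores stop codons; the function name and the example peptides are hypothetical and are not taken from the sequences of the invention.

```python
# Number of synonymous codons per amino acid in the standard genetic code
# (one-letter amino acid codes; the 20 values sum to the 61 sense codons).
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encoding_sequences(peptide: str) -> int:
    """Return the number of distinct coding sequences for a peptide.

    Each residue can be encoded by any of its synonymous codons, so the
    total is the product of the per-residue codon counts.
    """
    total = 1
    for residue in peptide.upper():
        total *= CODON_COUNTS[residue]  # KeyError for non-standard residues
    return total


if __name__ == "__main__":
    # Hypothetical tripeptide Met-Leu-Ser: 1 * 6 * 6 possible sequences.
    print(count_encoding_sequences("MLS"))
```

Because the count is a product over all residues, it grows very quickly with the length of the protein, which is why the variants are described collectively rather than enumerated.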

Also encompassed by the invention are polynucleotide sequences that are 
capable of hybridizing to the claimed nucleotide sequences, and in 
particular, those of the polynucleotide encoding (i) Drosophila gilgamesh 
(gish), human casein kinase 1, gamma 1, human casein kinase 1, gamma 
2, human casein kinase 1, gamma 3, (ii) Drosophila Gadfly Accession 
Number CG1534 (synonym for Gadfly Accession Number CG32672 and 
GenBank Accession Number NM_167245), human GABARAP, human 
GABARAP like 1, human GABARAP like, human GABARAP like 3, (iii) 
Drosophila Gadfly Accession Number CG10576, human PA2G4, (iv) 
Drosophila Mocs1, human MOCSA, human MOCS1 isoform 1, human 
MOCS1 isoform 2, human MOCS1 isoform 3, (v) Drosophila peanut (pnut), 
human CDC10, (vi) Drosophila Gadfly Accession Number CG7069, human 
pyruvate kinase, muscle, human pyruvate kinase, liver and RBC, (vii) 
Drosophila calreticulin (Crc), human calreticulin, or human calreticulin 2, 
under various conditions of stringency. Hybridization conditions are based 
on the melting temperature (Tm) of the nucleic acid binding complex or 
probe, as taught in Wahl, G. M. and S. L. Berger (1987; Methods Enzymol. 
152:399-407) and Kimmel, A. R. (1987; Methods Enzymol. 152:507-511), 
and may be used at a defined stringency. Preferably, hybridization under 
stringent conditions means that after washing for 1 h with 1 x SSC and 
0.1% SDS at 50°C, preferably at 55°C, more preferably at 62°C and most 
preferably at 68°C, particularly for 1 h in 0.2 x SSC and 0.1% SDS at 
50°C, preferably at 55°C, more preferably at 62°C and most preferably at 
68°C, a positive hybridization signal is observed. Altered nucleic acid 
sequences encoding casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, 
CDC10, PK, calreticulin, or homologous proteins which are encompassed 
by the invention include deletions, insertions, or substitutions of different 
nucleotides resulting in a polynucleotide that encodes the same or a 
functionally equivalent casein kinase 1 gamma, GABARAP, PA2G4, 
MOCS1, CDC10, PK, calreticulin, or homologous proteins. 
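
Since the hybridization conditions above are tied to the melting temperature (Tm) of the probe, a first estimate of Tm can be sketched as follows. This example uses the Wallace rule (2°C per A/T and 4°C per G/C), which is a coarse approximation intended for short oligonucleotides and is an assumption of this sketch; it is not the method of the references cited above, and the function name is hypothetical.

```python
def wallace_tm(oligo: str) -> int:
    """Estimate the melting temperature (in degrees C) of a short DNA oligo.

    Wallace rule: Tm = 2 * (A + T) + 4 * (G + C). This is only a rough
    guide for short probes; it ignores salt, formamide, and probe
    concentration.
    """
    seq = oligo.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("oligo must be a non-empty string of A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    # Hypothetical 8-mer made only of G and C: 8 * 4 degrees.
    print(wallace_tm("GGGGCCCC"))
```

For longer probes, or when salt and formamide concentrations matter, a more detailed model would be needed; the sketch only conveys why GC-rich probes tolerate more stringent washes.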

The encoded proteins may also contain deletions, insertions, or 
substitutions of amino acid residues, which produce a silent change and 
result in functionally equivalent casein kinase 1 gamma, GABARAP, 
PA2G4, MOCS1, CDC10, PK, calreticulin, or homologous proteins. 
Deliberate amino acid substitutions may be made on the basis of similarity 
in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the 
amphipathic nature of the residues as long as the biological activity of 
casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, 
calreticulin, or homologous proteins is retained. Furthermore, the invention 
relates to peptide fragments of the proteins or derivatives thereof such as 
cyclic peptides, retro-inverso peptides, or peptide mimetics having a length of at least 
4, preferably at least 6 and up to 50 amino acids. 

Also included within the scope of the present invention are alleles of the 
genes encoding casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, 
CDC10, PK, calreticulin, or homologous proteins. As used herein, an 
"allele" or "allelic sequence" is an alternative form of the gene, which may 
result from at least one mutation in the nucleic acid sequence. Alleles may 
result in altered mRNAs or polypeptides whose structures or function may 
or may not be altered. Any given gene may have none, one, or many allelic 
forms. Common mutational changes, which give rise to alleles, are 
generally ascribed to natural deletions, additions, or substitutions of 
nucleotides. Each of these types of changes may occur alone, or in 
combination with the others, one or more times in a given sequence. 



WO 03/066086 PCT/EP03/01094 

The nucleic acid sequences encoding casein kinase 1 gamma, GABARAP, 
PA2G4, MOCS1, CDC10, PK, calreticulin, or homologous proteins may be 
extended utilizing a partial nucleotide sequence and employing various 
methods known in the art to detect upstream sequences such as 
promoters and regulatory elements. For example, one method which may 
be employed, "restriction-site" PCR, uses universal primers to retrieve 
unknown sequence adjacent to a known locus (Sarkar, G. (1993) PCR 
Methods Applic. 2:318-322). Inverse PCR may also be used to amplify or 
extend sequences using divergent primers based on a known region 
(Triglia, T. et al. (1988) Nucleic Acids Res. 16:8186). 

Another method which may be used is capture PCR which involves PCR 
amplification of DNA fragments adjacent to a known sequence in human 
and yeast artificial chromosome DNA (Lagerstrom, M. et al. (1991) PCR Methods 
Applic. 1:111-119). Another method which may be used to retrieve 
unknown sequences is that of Parker, J. D. et al. (1991; Nucleic Acids 
Res. 19:3055-3060). Additionally, one may use PCR, nested primers, and 
PROMOTERFINDER libraries to walk in genomic DNA (Clontech, Palo Alto, 
Calif.). This process avoids the need to screen libraries and is useful in 
finding intron/exon junctions. 

In order to express a biologically active casein kinase 1 gamma, GABARAP, 
PA2G4, MOCS1, CDC10, PK, calreticulin, or homologous protein, the 
nucleotide sequences encoding casein kinase 1 gamma, GABARAP, 
PA2G4, MOCS1, CDC10, PK, calreticulin, or homologous proteins, or 
functional equivalents thereof, may be inserted into appropriate expression vectors, 
i.e., a vector, which contains the necessary elements for the transcription 
and translation of the inserted coding sequence. Methods, which are well 
known to those skilled in the art, may be used to construct expression 
vectors containing sequences encoding casein kinase 1 gamma, 
GABARAP, PA2G4, MOCS1, CDC10, PK, calreticulin, or homologous 
proteins and appropriate transcriptional and translational control elements. 
Regulatory elements include, for example, a promoter, an initiation codon, a 
stop codon, an mRNA stability regulatory element, and a polyadenylation 
signal. Expression of a polynucleotide can be assured by (i) constitutive 
promoters such as the Cytomegalovirus (CMV) promoter/enhancer region, 
(ii) tissue specific promoters such as the insulin promoter (see Soria et al., 
2000, Diabetes 49:157), SOX2 gene promoter (see Li et al., (1998) Curr. 
Biol. 8:971-974), Msi-1 promoter (see Sakakibara et al., (1997) J. 
Neuroscience 17:8300-8312), alpha-cardiac myosin heavy chain promoter 
or human atrial natriuretic factor promoter (Klug et al., (1996) J. Clin. 
Invest. 98:216-224; Wu et al., (1989) J. Biol. Chem. 264:6472-6479) or 
(iii) inducible promoters such as the tetracycline inducible system. 
Expression vectors can also contain a selection agent or marker gene that 
confers antibiotic resistance such as the neomycin, hygromycin or 
puromycin resistance genes. The methods for constructing such vectors 
include in vitro recombinant 
DNA techniques, synthetic techniques, and in vivo genetic recombination. 
Such techniques are described in Sambrook, J. et al. (1989) Molecular 
Cloning, A Laboratory Manual, Cold Spring Harbor Press, Plainview, N.Y., 
and Ausubel, F. M. et al. (1989) Current Protocols in Molecular Biology, 
John Wiley & Sons, New York, N.Y. 

A variety of expression vector/host systems may be utilized to contain and 
express sequences encoding casein kinase 1 gamma, GABARAP, PA2G4, 
MOCS1, CDC10, PK, calreticulin, or homologous proteins. These include, 
but are not limited to, micro-organisms such as bacteria transformed with 
recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; 
yeast transformed with yeast expression vectors; insect cell systems 
infected with virus expression vectors (e.g., baculovirus); plant cell 
systems transformed with virus expression vectors (e.g., cauliflower 
mosaic virus, CaMV; tobacco mosaic virus, TMV) or with bacterial 
expression vectors (e.g., Ti or pBR322 plasmids); or animal cell systems. 
The "control elements" or "regulatory sequences" are those non-translated 
regions of the vector (enhancers, promoters, 5' and 3' untranslated regions) 
which interact with host cellular proteins to carry out transcription and 
translation. Such elements may vary in their strength and specificity. 
Depending on the vector system and host utilized, any number of suitable 
transcription and translation elements, including constitutive and inducible 
promoters, may be used. 

The presence of polynucleotide sequences encoding casein kinase 1 
gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, calreticulin, or 
homologous proteins can be detected by DNA-DNA or DNA-RNA 
hybridization and/or amplification using probes or portions or fragments of 
polynucleotides encoding casein kinase 1 gamma, GABARAP, PA2G4, 
MOCS1, CDC10, PK, calreticulin, or homologous proteins. Nucleic acid 
amplification based assays involve the use of oligonucleotides or oligomers 
based on the sequences encoding casein kinase 1 gamma, GABARAP, 
PA2G4, MOCS1, CDC10, PK, calreticulin, or homologous proteins to 
detect transformants containing DNA or RNA encoding casein kinase 1 
gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, calreticulin, or 
homologous proteins. As used herein "oligonucleotides" or "oligomers" 
refer to a nucleic acid sequence of at least about 10 nucleotides and as 
many as about 60 nucleotides, preferably about 15 to 30 nucleotides, and 
more preferably about 20-25 nucleotides, which can be used as a probe or 
amplimer. 
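
The length ranges given above can be illustrated with a minimal Python sketch that enumerates candidate oligomers from a target sequence by sliding windows. The default window sizes follow the "more preferably about 20-25 nucleotides" range stated above; the function name and the example sequence are hypothetical, and no claim is made that probes or amplimers were selected this way.

```python
from typing import Iterator, Tuple


def candidate_oligos(
    sequence: str, min_len: int = 20, max_len: int = 25
) -> Iterator[Tuple[int, str]]:
    """Yield (start, oligo) for every window of min_len..max_len nucleotides.

    Windows longer than the sequence are skipped automatically because
    the inner range is empty in that case.
    """
    seq = sequence.upper()
    for length in range(min_len, max_len + 1):
        for start in range(len(seq) - length + 1):
            yield start, seq[start:start + length]


if __name__ == "__main__":
    # Hypothetical 22-nt target: three 20-mers, two 21-mers, one 22-mer.
    target = "ACGT" * 5 + "AC"
    for start, oligo in candidate_oligos(target):
        print(start, len(oligo), oligo)
```

In practice the candidates would still have to be filtered, for example by melting temperature and by specificity against the host genome, before being used as probes or primers.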

The presence of proteins of the invention in a sample can be determined by 
immunological methods or activity measurement. A variety of protocols for 
detecting and measuring the expression of casein kinase 1 gamma, 
GABARAP, PA2G4, MOCS1, CDC10, PK, calreticulin, or homologous 
proteins, using either polyclonal or monoclonal antibodies specific for the 
protein are known in the art. Examples include enzyme-linked 
immunosorbent assay (ELISA), radioimmunoassay (RIA), and fluorescence 
activated cell sorting (FACS). A two-site, monoclonal-based immunoassay 
utilizing monoclonal antibodies reactive to two non-interfering epitopes on 
casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, 
calreticulin, or homologous proteins is preferred, but a competitive binding 
assay may be employed. These and other assays are described, among 
other places, in Hampton, R. et al. (1990; Serological Methods, a 
Laboratory Manual, APS Press, St Paul, Minn.) and Maddox, D. E. et al. 
(1983; J. Exp. Med. 158:1211-1216). 

A wide variety of labels and conjugation techniques are known by those 
skilled in the art and may be used in various nucleic acid and amino acid 
assays. Means for producing labeled hybridization or PCR probes for 
detecting sequences related to polynucleotides encoding casein kinase 1 
gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, calreticulin, or 
homologous proteins include oligo-labeling, nick translation, end-labeling or 
PCR amplification using a labeled nucleotide, or enzymatic synthesis. 

Alternatively, the sequences encoding casein kinase 1 gamma, GABARAP, 
PA2G4, MOCS1, CDC10, PK, calreticulin, or homologous proteins, or any 
portions thereof may be cloned into a vector for the production of an 
mRNA probe. Such vectors are known in the art, are commercially 
available, and may be used to synthesize RNA probes in vitro by addition 
of an appropriate RNA polymerase such as T7, T3, or SP6 and labeled 
nucleotides. These procedures may be conducted using a variety of 
commercially available kits (Pharmacia & Upjohn (Kalamazoo, Mich.); 
Promega (Madison, Wis.); and U.S. Biochemical Corp. (Cleveland, Ohio)). 

Suitable reporter molecules or labels, which may be used, include 
radionuclides, enzymes, fluorescent, chemiluminescent, or chromogenic 
agents as well as substrates, co-factors, inhibitors, magnetic particles, and 
the like. 

Host cells transformed with nucleotide sequences encoding casein kinase 
1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, calreticulin, or 
homologous proteins may be cultured under conditions suitable for the 
expression and recovery of the protein from cell culture. The protein 
produced by a recombinant cell may be secreted or contained intracellularly 
depending on the sequence and/or the vector used. As will be understood 
by those of skill in the art, expression vectors containing polynucleotides 
which encode casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, 
CDC10, PK, calreticulin, or homologous proteins may be designed to 
contain signal sequences, which direct secretion of casein kinase 1 
gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, calreticulin, or 
homologous proteins through a prokaryotic or eukaryotic cell membrane. 
Other recombinant constructions may be used to join sequences encoding 
casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, 
calreticulin, or homologous proteins to nucleotide sequence encoding a 
polypeptide domain, which will facilitate purification of soluble proteins. 
Such purification facilitating domains include, but are not limited to, metal 
chelating peptides such as histidine-tryptophan modules that allow 
purification on immobilized metals, protein A domains that allow 
purification on immobilized immunoglobulin, and the domain utilized in the 
FLAG extension/affinity purification system (Immunex Corp., Seattle, 
Wash.). The inclusion of cleavable linker sequences such as those specific 
for Factor Xa or enterokinase (Invitrogen, San Diego, Calif.) between the 
purification domains and casein kinase 1 gamma, GABARAP, PA2G4, 
MOCS1, CDC10, PK, calreticulin, or homologous proteins may be used to 
facilitate purification. 

The nucleic acids encoding the proteins of the invention can be used to 
generate transgenic animals or site-specific gene modifications in cell lines. 
Transgenic animals may be made through homologous recombination, 
where the normal locus of the genes encoding the proteins of the invention 
is altered. Alternatively, a nucleic acid construct is randomly integrated into 
the genome. Vectors for stable integration include plasmids, retroviruses 
and other animal viruses, YACs, and the like. The modified cells or animals 
are useful in the study of the function and regulation of the proteins of the 
invention. For example, a series of small deletions and/or substitutions may 
be made in the genes that encode the proteins of the invention to 
determine the role of particular domains of the protein, functions in 
pancreatic differentiation, etc. 

Specific constructs of interest include anti-sense molecules, which will 
block the expression of the proteins of the invention, or expression of 
dominant negative mutations. A detectable marker, such as for example 
lac-Z, may be introduced in the locus of the genes of the invention, where 
upregulation of expression of the genes of the invention will result in an 
easily detected change in phenotype. 

One may also provide for expression of the genes of the invention or 
variants thereof in cells or tissues where they are not normally expressed or at 
abnormal times of development. In addition, by providing expression of the 
proteins of the invention in cells in which they are not normally produced, 
one can induce changes in cell behavior. 

DNA constructs for homologous recombination will comprise at least 
portions of the genes of the invention with the desired genetic 
modification, and will include regions of homology to the target locus. DNA 
constructs for random integration need not include regions of homology to 
mediate recombination. Conveniently, markers for positive and/or negative 
selection are included. Methods for generating cells having targeted gene 
modifications through homologous recombination are known in the art. For 
embryonic stem (ES) cells, an ES cell line may be employed, or embryonic 
cells may be obtained freshly from a host, e.g. mouse, rat, guinea pig etc. 
Such cells are grown on an appropriate fibroblast-feeder layer or grown in 
the presence of leukemia inhibitory factor (LIF). 

When ES or embryonic cells or somatic pluripotent stem cells have been 
transformed, they may be used to produce transgenic animals. After 
transformation, the cells are plated onto a feeder layer in an appropriate 
medium. Cells containing the construct may be detected by employing a 
selective medium. After sufficient time for colonies to grow, they are 
picked and analyzed for the occurrence of homologous recombination or 
integration of the construct. Those colonies that are positive may then be 
used for embryo manipulation and blastocyst injection. Blastocysts are 
obtained from 4 to 6 week old superovulated females. The ES cells are 
trypsinized, and the modified cells are injected into the blastocoel of the 
blastocyst. After injection, the blastocysts are returned to each uterine 
horn of pseudopregnant females. Females are then allowed to go to term 
and the resulting offspring screened for the construct. By providing for a 
different phenotype of the blastocyst and the genetically modified cells, 
chimeric progeny can be readily detected. The chimeric animals are 
screened for the presence of the modified gene and males and females 
having the modification are mated to produce homozygous progeny. If the 
gene alterations cause lethality at some point in development, tissues or 
organs can be maintained as allogeneic or congenic grafts or transplants, or 
in vitro culture. The transgenic animals may be any non-human mammal, 
such as laboratory animals, domestic animals, etc. The transgenic animals 
may be used in functional studies, drug screening, etc. 



Diagnostics and Therapeutics 

The data disclosed in this invention show that the nucleic acids and 
proteins of the invention and effector molecules thereof are useful in 
diagnostic and therapeutic applications implicated, for example but not 
limited to, in metabolic disorders such as obesity as well as related 
disorders such as eating disorder, cachexia, diabetes mellitus, 
hypertension, coronary heart disease, hypercholesterolemia, dyslipidemia, 
osteoarthritis, and gallstones. Hence, diagnostic and therapeutic uses for 
the casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, or 
calreticulin nucleic acids and proteins, or homologous proteins of the 
invention are, for example but not limited to, the following: (i) protein 
therapeutic, (ii) small molecule drug target, (iii) antibody target 
(therapeutic, diagnostic, drug targeting/cytotoxic antibody), (iv) diagnostic 
and/or prognostic marker, (v) gene therapy (gene delivery/gene ablation), 
(vi) research tools, and (vii) tissue regeneration in vitro and in vivo 
(regeneration for all these tissues and cell types composing these tissues 
and cell types derived from these tissues). 

The nucleic acids and proteins of the invention are useful in various 
diagnostic and therapeutic applications as described 
below. For example, but not limited to, cDNAs encoding the casein kinase 
1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, or calreticulin proteins 
of the invention and particularly their human homologues may be useful in 
gene therapy, and the casein kinase 1 gamma, GABARAP, PA2G4, 
MOCS1, CDC10, PK, or calreticulin proteins of the invention and 
particularly their human homologues may be useful when administered to 
a subject in need thereof. By way of non-limiting example, the 
compositions of the present invention will have efficacy for treatment of 
patients suffering from, for example, but not limited to, metabolic 
disorders as described above. 

The nucleic acid encoding the casein kinase 1 gamma, GABARAP, PA2G4, 
MOCS1, CDC10, PK, or calreticulin proteins of the invention, or 
homologous proteins, or fragments thereof, may further be useful in 
diagnostic applications, wherein the presence or amount of the nucleic 
acids or the proteins are to be assessed. These materials are further useful 
in the generation of antibodies that bind immunospecifically to the novel 
substances of the invention for use in therapeutic or diagnostic methods. 
For example, in one aspect, antibodies which are specific for casein kinase 
1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, calreticulin, or 
homologous proteins may be used directly as an antagonist, or indirectly as 
a targeting or delivery mechanism for bringing a pharmaceutical agent to 
cells or tissue which express casein kinase 1 gamma, GABARAP, PA2G4, 
MOCS1, CDC10, PK, calreticulin, or homologous proteins. The antibodies 
may be generated using methods that are well known in the art. Such 
antibodies may include, but are not limited to, polyclonal, monoclonal, 
chimeric, single chain, Fab fragments, and fragments produced by a Fab 
expression library. Neutralising antibodies (i.e., those which inhibit dimer 
formation) are especially preferred for therapeutic use. 

For the production of antibodies, various hosts including goats, rabbits, 
rats, mice, humans, and others, may be immunized by injection with casein 
kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, calreticulin, or 
homologous proteins, or any fragment or oligopeptide thereof which has 
immunogenic properties. Depending on the host species, various adjuvants 
may be used to increase immunological response. Such adjuvants include, 
but are not limited to, Freund's, mineral gels such as aluminium hydroxide, 
and surface active substances such as lysolecithin, pluronic polyols, 
polyanions, peptides, oil emulsions, keyhole limpet hemocyanin, and 
dinitrophenol. Among adjuvants used in humans, BCG (Bacille 
Calmette-Guerin) and Corynebacterium parvum are especially preferable. It 
is preferred that the peptides, fragments, or oligopeptides used to induce 
antibodies to casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, 
CDC10, PK, calreticulin, or homologous proteins have an amino acid 
sequence consisting of at least five amino acids, and more preferably at 
least 10 amino acids. 

Monoclonal antibodies to casein kinase 1 gamma, GABARAP, PA2G4, 
MOCS1, CDC10, PK, calreticulin, or homologous proteins may be prepared 
using any technique that provides for the production of antibody molecules 
by continuous cell lines in culture. These include, but are not limited to, the 
hybridoma technique, the human B-cell hybridoma technique, and the 
EBV-hybridoma technique (Kohler, G. et al. (1975) Nature 256:495-497; 
Kozbor, D. et al. (1985) J. Immunol. Methods 81:31-42; Cote, R. J. et al. 
(1983) Proc. Natl. Acad. Sci. 80:2026-2030; Cole, S. P. et al. (1984) Mol. Cell 
Biol. 62:109-120). 

In addition, techniques developed for the production of "chimeric 
antibodies", the splicing of mouse antibody genes to human antibody 
genes to obtain a molecule with appropriate antigen specificity and 
biological activity can be used (Morrison, S. L. et al. (1984) Proc. Natl. 
Acad. Sci. 81:6851-6855; Neuberger, M. S. et al (1984) Nature 
312:604-608; Takeda, S. et al. (1985) Nature 314:452-454). 
Alternatively, techniques described for the production of single chain 
antibodies may be adapted, using methods known in the art, to produce 
casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, 
calreticulin, or homologous protein-specific single chain antibodies. 
Antibodies with related specificity, but of distinct idiotypic composition, 
may be generated by chain shuffling from random combinatorial 
immunoglobulin libraries (Burton, D. R. (1991) Proc. Natl. Acad. Sci. 
88:11120-3). Antibodies may also be produced by inducing in vivo 
production in the lymphocyte population or by screening recombinant 
immunoglobulin libraries or panels of highly specific binding reagents as 
disclosed in the literature (Orlandi, R. et al. (1989) Proc. Natl. Acad. Sci. 
86:3833-3837; Winter, G. et al. (1991) Nature 349:293-299). 

Antibody fragments, which contain specific binding sites for casein kinase 
1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, calreticulin, or 
homologous proteins, may also be generated. For example, such fragments 
include, but are not limited to, the F(ab')2 fragments which can be 
produced by pepsin digestion of the antibody molecule and the Fab 
fragments which can be generated by reducing the disulfide bridges of 
F(ab')2 fragments. Alternatively, Fab expression libraries may be 
constructed to allow rapid and easy identification of monoclonal Fab 
fragments with the desired specificity (Huse, W. D. et al. (1989) Science 
254:1275-1281). 

Various immunoassays may be used for screening to identify antibodies 
having the desired specificity. Numerous protocols for competitive binding 
and immunoradiometric assays using either polyclonal or monoclonal 
antibodies with established specificities are well known in the art. Such 
immunoassays typically involve the measurement of complex formation 
between casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, 
PK, calreticulin, or homologous proteins and its specific antibody. A 
two-site, monoclonal-based immunoassay utilising monoclonal antibodies 
reactive to two non-interfering casein kinase 1 gamma, GABARAP, PA2G4, 
MOCS1, CDC10, PK, calreticulin, or homologous protein epitopes is 
preferred, but a competitive binding assay may also be employed (Maddox, 
supra). 

In another embodiment of the invention, the polynucleotides encoding 
casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, 
calreticulin, or homologous proteins, or any fragment thereof, or nucleic 
acid effector molecules such as antisense molecules, aptamers, RNAi 
molecules or ribozymes may be used for therapeutic purposes. In one 
aspect, aptamers, i.e. nucleic acid molecules, which are capable of binding 
to a protein of the invention and modulating its activity, may be generated 
by a screening and selection procedure involving the use of combinatorial 
nucleic acid libraries. 

In a further aspect, antisense molecules may be used for therapeutic 
purposes. In one aspect, antisense to the polynucleotide encoding casein 
kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, calreticulin, or 
homologous proteins may be used in situations in which it would be 
desirable to block the transcription of the mRNA. In particular, cells may be 
transformed with sequences complementary to polynucleotides encoding 
casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, 
calreticulin, or homologous proteins. Thus, antisense molecules may be 
used to modulate casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, 
CDC10, PK, calreticulin, or homologous protein activity, or to achieve 
regulation of gene function. Such technology is now well known in the art, 
and sense or antisense oligomers or larger fragments can be designed 
from various locations along the coding or control regions of sequences 
encoding casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, 
PK, calreticulin, or homologous proteins. Expression vectors derived from 
retroviruses, adenovirus, herpes or vaccinia viruses, or from various 
bacterial plasmids may be used for delivery of nucleotide sequences to the 
targeted organ, tissue or cell population. Methods, which are well known 
to those skilled in the art, can be used to construct recombinant vectors, 
which will express antisense molecules complementary to the 
polynucleotides of the genes encoding casein kinase 1 gamma, GABARAP, 
PA2G4, MOCS1, CDC10, PK, calreticulin, or homologous proteins. These 
techniques are described both in Sambrook et al. (supra) and in Ausubel et 
al. (supra). Genes encoding casein kinase 1 gamma, GABARAP, PA2G4, 
MOCS1, CDC10, PK, calreticulin, or homologous proteins can be turned off 
by transforming a cell or tissue with expression vectors which express high 
levels of polynucleotide or fragment thereof which encodes casein kinase 
1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, calreticulin, or 
homologous proteins. Such constructs may be used to introduce 
untranslatable sense or antisense sequences into a cell. Even in the 
absence of integration into the DNA, such vectors may continue to 
transcribe RNA molecules until they are disabled by endogenous nucleases. 
Transient expression may last for a month or more with a non-replicating 
vector and even longer if appropriate replication elements are part of the 
vector system. 

As mentioned above, modifications of gene expression can be obtained by 
designing antisense molecules, DNA, RNA, or nucleic acid analogues such 
as PNA, to the control regions of the genes encoding casein kinase 1 
gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, calreticulin, or 
homologous proteins, i.e., the promoters, enhancers, and introns. 
Oligonucleotides derived from the transcription initiation site, e.g., between 
positions -10 and +10 from the start site, are preferred. Similarly, 
inhibition can be achieved using "triple helix" base-pairing methodology. 
Triple helix pairing is useful because it causes inhibition of the ability of the 
double helix to open sufficiently for the binding of polymerases, 
transcription factors, or regulatory molecules. Recent therapeutic advances 
using triplex DNA have been described in the literature (Gee, J. E. et al. 
(1994) In: Huber, B. E. and B. I. Carr, Molecular and Immunologic 
Approaches, Futura Publishing Co., Mt. Kisco, N.Y.). The antisense 
molecules may also be designed to block translation of mRNA by 
preventing the transcript from binding to ribosomes. 

Ribozymes, enzymatic RNA molecules, may also be used to catalyze the 
specific cleavage of RNA. The mechanism of ribozyme action involves 
sequence-specific hybridization of the ribozyme molecule to complementary 
target RNA, followed by endonucleolytic cleavage. Examples, which may 
be used, include engineered hammerhead motif ribozyme molecules that 
can specifically and efficiently catalyze endonucleolytic cleavage of 
sequences encoding casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, 
CDC10, PK, calreticulin, or homologous proteins. Specific ribozyme 
cleavage sites within any potential RNA target are initially identified by 
scanning the target molecule for ribozyme cleavage sites which include the 
following sequences: GUA, GUU, and GUC. Once identified, short RNA 
sequences of between 15 and 20 ribonucleotides corresponding to the 
region of the target gene containing the cleavage site may be evaluated for 
secondary structural features which may render the oligonucleotide 
inoperable. The suitability of candidate targets may also be evaluated by 
testing accessibility to hybridization with complementary oligonucleotides 
using ribonuclease protection assays. 
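
By way of illustration only, the scanning step described above may be sketched as follows. The function names, the default window length of 17 ribonucleotides and the centring of the window on the cleavage triplet are assumptions made for this example and are not taken from the specification; evaluation of secondary structure is not attempted.

```python
import re

# Cleavage triplets named in the text: GUA, GUU and GUC.
CLEAVAGE_SITE = re.compile(r"GU[AUC]")


def find_cleavage_sites(rna):
    """Return (0-based position, triplet) for every GUA/GUU/GUC in the target."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    return [(m.start(), m.group(0)) for m in CLEAVAGE_SITE.finditer(rna)]


def candidate_windows(rna, length=17):
    """Yield (site position, window) for windows of 15 to 20 nt around each site.

    The window is placed so that the triplet sits roughly in its middle
    (an assumption of this sketch). Sites too close to either end of the
    target to give a full-length window are skipped.
    """
    if not 15 <= length <= 20:
        raise ValueError("window length should be between 15 and 20 nt")
    rna = rna.upper().replace("T", "U")
    flank = (length - 3) // 2
    for pos, _triplet in find_cleavage_sites(rna):
        start = pos - flank
        end = start + length
        if start >= 0 and end <= len(rna):
            yield pos, rna[start:end]


if __name__ == "__main__":
    target = "AAAAAAAAGUCAAAAAAAAA"  # hypothetical target sequence
    for pos, window in candidate_windows(target):
        print(pos, window)
```

Each window returned in this way would then be a candidate for the structural and accessibility tests described above.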

Nucleic acid effector molecules such as antisense molecules and ribozymes 
of the invention may be prepared by any method known in the art for the 
synthesis of nucleic acid molecules. These include techniques for 
chemically synthesizing oligonucleotides such as solid phase 
phosphoramidite chemical synthesis. Alternatively, RNA molecules may be 
generated by in vitro and in vivo transcription of DNA sequences encoding 
casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, 
calreticulin, or homologous proteins. Such DNA sequences may be 
incorporated into a variety of vectors with suitable RNA polymerase 
promoters such as T7 or SP6. Alternatively, these cDNA constructs that 
synthesize antisense RNA constitutively or inducibly can be introduced into 
cell lines, cells, or tissues. RNA molecules may be modified to increase 
intracellular stability and half-life. Possible modifications include, but are 
not limited to, the addition of flanking sequences at the 5' and/or 3' ends 
of the molecule or the use of phosphorothioate or 2' O-methyl rather than 
phosphodiester linkages within the backbone of the molecule. This 
concept is inherent in the production of PNAs and can be extended in all of 
these molecules by the inclusion of non-traditional bases such as inosine, 
queuosine, and wybutosine, as well as acetyl-, methyl-, thio-, and similarly 
modified forms of adenine, cytidine, guanine, thymine, and uridine which 
are not as easily recognized by endogenous endonucleases. 

Many methods for introducing vectors into cells or tissues are available and 
equally suitable for use in vivo, in vitro, and ex vivo. For ex vivo therapy, 
vectors may be introduced into stem cells taken from the patient and 
clonally propagated for autologous transplant back into that same patient. 
Delivery by transfection and by liposome injections may be achieved using 
methods, which are well known in the art. Any of the therapeutic methods 
described above may be applied to any suitable subject including, for 
example, mammals such as dogs, cats, cows, horses, rabbits, monkeys, 
and most preferably, humans. 

An additional embodiment of the invention relates to the administration of 
a pharmaceutical composition, in conjunction with a pharmaceutically 
acceptable carrier, for any of the therapeutic effects discussed above. 
Such pharmaceutical compositions may consist of casein kinase 1 gamma, 
GABARAP, PA2G4, MOCS1, CDC10, PK, calreticulin, or homologous 
proteins, antibodies to casein kinase 1 gamma, GABARAP, PA2G4, 
MOCS1, CDC10, PK, calreticulin, or homologous proteins, mimetics, 
agonists, antagonists, or inhibitors of casein kinase 1 gamma, GABARAP, 
PA2G4, MOCS1, CDC10, PK, calreticulin, or homologous proteins. The 
compositions may be administered alone or in combination with at least 
one other agent, such as a stabilizing compound, which may be administered 
in any sterile, biocompatible pharmaceutical carrier, including, but not 
limited to, saline, buffered saline, dextrose, and water. The compositions 
may be administered to a patient alone, or in combination with other 
agents, drugs or hormones. The pharmaceutical compositions utilized in 
this invention may be administered by any number of routes including, but 
not limited to, oral, intravenous, intramuscular, intra-arterial, 
intramedullary, intrathecal, intraventricular, transdermal, subcutaneous, 
intraperitoneal, intranasal, enteral, topical, sublingual, or rectal means. 

In addition to the active ingredients, these pharmaceutical compositions 
may contain suitable pharmaceutically-acceptable carriers comprising 
excipients and auxiliaries, which facilitate processing of the active 
compounds into preparations which can be used pharmaceutically. Further 
details on techniques for formulation and administration may be found in 
the latest edition of Remington's Pharmaceutical Sciences (Mack 
Publishing Co., Easton, Pa.). 

The pharmaceutical compositions of the present invention may be 
manufactured in a manner that is known in the art, e.g., by means of 
conventional mixing, dissolving, granulating, dragee-making, levigating, 
emulsifying, encapsulating, entrapping, or lyophilizing processes. After 
pharmaceutical compositions have been prepared, they can be placed in an 
appropriate container and labeled for treatment of an indicated condition. 
For administration of casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, 
CDC10, PK, calreticulin, or homologous proteins, such labeling would 
include amount, frequency, and method of administration. 

Pharmaceutical compositions suitable for use in the invention include 
compositions wherein the active ingredients are contained in an effective 
amount to achieve the intended purpose. The determination of an effective 
dose is well within the capability of those skilled in the art. For any 
compound, the therapeutically effective dose can be estimated initially 
either in cell culture assays, e.g., of preadipocyte cell lines, or in animal 
models, usually mice, rabbits, dogs, or pigs. The animal model may also be 
used to determine the appropriate concentration range and route of 
administration. Such information can then be used to determine useful 
doses and routes for administration in humans. A therapeutically effective 
dose refers to that amount of active ingredient, for example casein kinase 
1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, calreticulin, or 
homologous proteins or fragments thereof, or antibodies, which is effective 
against a specific condition. Therapeutic efficacy and toxicity may be 
determined by standard pharmaceutical procedures in cell cultures or 
experimental animals, e.g., ED50 (the dose therapeutically effective in 50% 
of the population) and LD50 (the dose lethal to 50% of the population). 
The dose ratio between therapeutic and toxic effects is the therapeutic 
index, and it can be expressed as the ratio LD50/ED50. Pharmaceutical 
compositions which exhibit large therapeutic indices are preferred. The 
data obtained from cell culture assays and animal studies is used in 
formulating a range of dosage for human use. The dosage contained in 
such compositions is preferably within a range of circulating concentrations 
that include the ED50 with little or no toxicity. The dosage varies within 
this range depending upon the dosage being employed, the sensitivity of 
the patient, and the route of administration. The exact dosage will be 
determined by the practitioner, in light of factors related to the subject that 
requires treatment. Dosage and administration are adjusted to provide 
sufficient levels of the active moiety or to maintain the desired effect. 
Factors, which may be taken into account, include the severity of the 
disease state, general health of the subject, age, weight, and gender of the 
subject, diet, time and frequency of administration, drug combination(s), 
reaction sensitivities, and tolerance/response to therapy. Long-acting 
pharmaceutical compositions may be administered every 3 to 4 days, every 
week, or once every two weeks depending on half-life and clearance rate 
of the particular formulation. Normal dosage amounts may vary from 0.1 to 
100,000 micrograms, up to a total dose of about 1 g, depending upon the 
route of administration. Guidance as to particular dosages and methods of 
delivery is provided in the literature and generally available to practitioners 
in the art. Those skilled in the art employ different formulations for 
nucleotides than for proteins or their inhibitors. Similarly, delivery of 
polynucleotides or polypeptides will be specific to particular cells, 
conditions, locations, etc. 
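
As a purely illustrative sketch, the therapeutic index defined above as the ratio LD50/ED50 may be computed as follows. The function name and the numerical values are hypothetical and are not data of the invention; both doses are assumed to be expressed in the same units.

```python
def therapeutic_index(ld50, ed50):
    """Return the therapeutic index, i.e. the ratio LD50/ED50.

    ld50: dose lethal to 50% of the population.
    ed50: dose therapeutically effective in 50% of the population.
    Both doses must be positive and in the same units.
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


if __name__ == "__main__":
    # Hypothetical values chosen only to show the calculation.
    print(therapeutic_index(ld50=500.0, ed50=5.0))
```

A larger value indicates a wider margin between the effective and the lethal dose, which is why compositions with large therapeutic indices are preferred.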

In another embodiment, antibodies which specifically bind casein kinase 1 
gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, calreticulin, or 
homologous proteins may be used for the diagnosis of conditions or 
diseases characterized by or associated with over- or underexpression of 
casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, 
calreticulin, or homologous proteins, or in assays to monitor patients being 
treated with casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, 
PK, calreticulin, or homologous proteins, agonists, antagonists or inhibitors. 
The antibodies useful for diagnostic purposes may be prepared in the same 
manner as those described above for therapeutics. Diagnostic assays for 
casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, 
calreticulin, or homologous proteins include methods, which utilize the 
antibody and a label to detect casein kinase 1 gamma, GABARAP, PA2G4, 
MOCS1, CDC10, PK, calreticulin, or homologous proteins in human body 
fluids or extracts of cells or tissues. The antibodies may be used with or 
without modification, and may be labeled by joining them, either covalently 
or non-covalently, with a reporter molecule. A wide variety of reporter 
molecules which are known in the art may be used, several of which are 
described above. 

A variety of protocols including ELISA, RIA, and FACS for measuring 
casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, 
calreticulin, or homologous proteins are known in the art and provide a 
basis for diagnosing altered or abnormal levels of casein kinase 1 gamma, 
GABARAP, PA2G4, MOCS1, CDC10, PK, calreticulin, or homologous 
protein expression. Normal or standard values for casein kinase 1 gamma, 
GABARAP, PA2G4, MOCS1, CDC10, PK, calreticulin, or homologous 
protein expression are established by combining body fluids or cell extracts 
taken from normal mammalian subjects, preferably human, with antibody 
to casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, 
calreticulin, or homologous proteins under conditions suitable for complex 
formation. The amount of standard complex formation may be quantified 
by various methods, but preferably by photometric means. Quantities of 
casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, 
calreticulin, or homologous proteins expressed in control and disease 
samples, e.g. from biopsied tissues, are compared with the standard values. 
Deviation between standard and subject values establishes the parameters 
for diagnosing disease. 

In another embodiment of the invention, the polynucleotides encoding 
casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, 
calreticulin, or homologous proteins may be used for diagnostic purposes. 
The polynucleotides, which may be used, include oligonucleotide 
sequences, antisense RNA and DNA molecules, and PNAs. The 
polynucleotides may be used to detect and quantitate gene expression in 
biopsied tissues in which expression of casein kinase 1 gamma, GABARAP, 
PA2G4, MOCS1, CDC10, PK, calreticulin, or homologous proteins may be 
correlated with disease. The diagnostic assay may be used to distinguish 
between absence, presence, and excess expression of casein kinase 1 
gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, calreticulin, or 
homologous proteins, and to monitor regulation of casein kinase 1 gamma, 
GABARAP, PA2G4, MOCS1, CDC10, PK, calreticulin, or homologous 
protein levels during therapeutic intervention. 

In one aspect, hybridization with probes which are capable of detecting 
polynucleotide sequences, including genomic sequences, encoding casein 
kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, calreticulin, or 
homologous proteins, or closely related molecules, may be used to identify 
nucleic acid sequences which encode casein kinase 1 gamma, GABARAP, 
PA2G4, MOCS1, CDC10, PK, calreticulin, or homologous proteins. The 
specificity of the probe, whether it is made from a highly specific region, 
e.g., unique nucleotides in the 5' regulatory region, or a less specific 
region, e.g., especially in the 3' coding region, and the stringency of the 
hybridization or amplification (maximal, high, intermediate, or low) will 
determine whether the probe identifies only naturally occurring sequences 
encoding casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, 
PK, calreticulin, or homologous proteins, alleles, or related sequences. 
Probes may also be used for the detection of related sequences, and 
should preferably contain at least 50% of the nucleotides from any of the 
casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, 
calreticulin, or homologous proteins encoding sequences. The hybridization 
probes of the subject invention may be DNA or RNA and derived from the 
nucleotide sequence of the polynucleotide comprising (i) Drosophila 
gilgamesh (gish), human casein kinase 1, gamma 1, human casein kinase 
1, gamma 2, human casein kinase 1, gamma 3, (ii) Drosophila Gadfly 
Accession Number CG1534, human GABARAP, human GABARAP like 1, 
human GABARAP like, human GABARAP like 3, (iii) Drosophila Gadfly 
Accession Number CG10576, human PA2G4, (iv) Drosophila Mocs1, 
human MOCSA, human MOCS1 isoform 1, human MOCS1 isoform 2, 
human MOCS1 isoform 3, (v) Drosophila peanut (pnut), human CDC10, (vi) 
Drosophila Gadfly Accession Number CG7069, human pyruvate kinase, 
muscle, human pyruvate kinase, liver and RBC, (vii) Drosophila calreticulin 
(Crc), human calreticulin, or human calreticulin 2, or from genomic 
sequence including promoter, enhancer elements, and introns of the 
naturally occurring casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, 
CDC10, PK, calreticulin, or homologous proteins. Means for producing 
specific hybridization probes for DNAs encoding casein kinase 1 gamma, 
GABARAP, PA2G4, MOCS1, CDC10, PK, calreticulin, or homologous 
proteins include the cloning of nucleic acid sequences encoding casein 
kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, calreticulin, or 
homologous protein derivatives into vectors for the production of mRNA 
probes. Such vectors are known in the art, commercially available, and 
may be used to synthesize RNA probes in vitro by means of the addition of 
the appropriate RNA polymerases and the appropriate labeled nucleotides. 
Hybridization probes may be labeled by a variety of reporter groups, for 
example, radionuclides such as 32P or 35S, or enzymatic labels, such as 
alkaline phosphatase coupled to the probe via avidin/biotin coupling 
systems, and the like. 

Polynucleotide sequences encoding casein kinase 1 gamma, GABARAP, 
PA2G4, MOCS1, CDC10, PK, calreticulin, or homologous proteins may be 
used for the diagnosis of conditions or diseases, which are associated with 
expression of casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, 
CDC10, PK, calreticulin, or homologous proteins. Examples of such 
conditions or diseases include, but are not limited to, pancreatic diseases 
and disorders, including diabetes. Polynucleotide sequences encoding 
casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, 
calreticulin, or homologous proteins may also be used to monitor the 
progress of patients receiving treatment for pancreatic diseases and 
disorders, including diabetes. The polynucleotide sequences encoding 
casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, 
calreticulin, or homologous proteins may be used in Southern or Northern 
analysis, dot blot, or other membrane-based technologies; in PCR 
technologies; or in dip stick, pin, ELISA or chip assays utilizing fluids or 
tissues from patient biopsies to detect altered casein kinase 1 gamma, 
GABARAP, PA2G4, MOCS1, CDC10, PK, calreticulin, or homologous 
protein expression. Such qualitative or quantitative methods are well 
known in the art. 

In a particular aspect, the nucleotide sequences encoding casein kinase 1 
gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, calreticulin, or 
homologous proteins may be useful in assays that detect activation or 
induction of various metabolic diseases such as obesity as well as related 
disorders such as eating disorder, cachexia, diabetes mellitus, 
hypertension, coronary heart disease, hypercholesterolemia, dyslipidemia, 
osteoarthritis, and gallstones. The nucleotide sequences encoding casein 
kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, calreticulin, or 
homologous proteins may be labeled by standard methods, and added to a 
fluid or tissue sample from a patient under conditions suitable for the 
formation of hybridization complexes. After a suitable incubation period, 
the sample is washed and the signal is quantitated and compared with a 
standard value. The presence of altered levels of nucleotide sequences 
encoding casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, 
PK, calreticulin, or homologous proteins in the sample indicates the 
presence of the associated disease. Such assays may also be used to 
evaluate the efficacy of a particular therapeutic treatment regimen in 
animal studies, in clinical trials, or in monitoring the treatment of an 
individual patient. 

In order to provide a basis for the diagnosis of disease associated with 
expression of casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, 
CDC10, PK, calreticulin, or homologous proteins, a normal or standard 
profile for expression is established. This may be accomplished by 
combining body fluids or cell extracts taken from normal subjects, either 
animal or human, with a sequence, or a fragment thereof, which encodes 
casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, 
calreticulin, or homologous proteins, under conditions suitable for 
hybridization or amplification. Standard hybridization may be quantified by 
comparing the values obtained from normal subjects with those from an 
experiment where a known amount of a substantially purified 
polynucleotide is used. Standard values obtained from normal samples may 
be compared with values obtained from samples from patients who are 
symptomatic for disease. Deviation between standard and subject values 
is used to establish the presence of disease. Once disease is established 
and a treatment protocol is initiated, hybridization assays may be repeated 
on a regular basis to evaluate whether the level of expression in the patient 
begins to approximate that which is observed in the normal patient. The 
results obtained from successive assays may be used to show the efficacy 
of treatment over a period ranging from several days to months. 

With respect to metabolic diseases such as described above, the presence 
of a relatively high amount of transcript in biopsied tissue from an 
individual may indicate a predisposition for the development of the disease, 
or may provide a means for detecting the disease prior to the appearance 
of actual clinical symptoms. A more definitive diagnosis of this type may 
allow health professionals to employ preventative measures or aggressive 
treatment earlier thereby preventing the development or further progression 
of the pancreatic diseases and disorders. 

Additional diagnostic uses for oligonucleotides designed from the 
sequences encoding casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, 
CDC10, PK, calreticulin, or homologous proteins may involve the use of 
PCR. Such oligomers may be chemically synthesized, generated 
enzymatically, or produced from a recombinant source. Oligomers will 
preferably consist of two nucleotide sequences, one with sense orientation 
(5'→3') and another with antisense (3'←5'), employed under 
optimized conditions for identification of a specific gene or condition. The 
same two oligomers, nested sets of oligomers, or even a degenerate pool 
of oligomers may be employed under less stringent conditions for detection 
and/or quantification of closely related DNA or RNA sequences. 
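
For illustration only, the derivation of such a pair of oligomers from a known template may be sketched as follows. The function names, the default oligomer length and the example template are hypothetical; the sketch simply takes the sense oligomer from the 5' end of the template and the antisense oligomer as the reverse complement of its 3' end, and does not attempt any optimisation of melting temperature or specificity.

```python
# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def primer_pair(template, length=20):
    """Return (sense, antisense) oligomers flanking the template.

    sense:     the first `length` bases of the template (5' to 3').
    antisense: the reverse complement of the last `length` bases, so that
               it anneals to the sense strand and is also written 5' to 3'.
    """
    if length <= 0 or 2 * length > len(template):
        raise ValueError("template too short for two non-overlapping oligomers")
    sense = template[:length]
    antisense = reverse_complement(template[-length:])
    return sense, antisense


if __name__ == "__main__":
    # Hypothetical template, used only to show the orientation of each oligomer.
    print(primer_pair("ATGCCCGGGTTTAAA", length=5))
```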

Methods which may also be used to quantitate the expression of casein 
kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, or calreticulin 
include radiolabeling or biotinylating nucleotides, coamplification of a 
control nucleic acid, and standard curves onto which the experimental 
results are interpolated (Melby, P. C. et al. (1993) J. Immunol. Methods, 
159:235-244; Duplaa, C. et al. (1993) Anal. Biochem. 212:229-236). The 
speed of quantification of multiple samples may be accelerated by running 
the assay in an ELISA format where the oligomer of interest is presented in 
various dilutions and a spectrophotometric or colorimetric response gives 
rapid quantification. 
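
The interpolation of an experimental result onto a standard curve may, by way of example, be sketched as follows. Piecewise-linear interpolation between neighbouring standards is an assumption of this sketch, as are the function name and the numerical values; other curve-fitting methods may equally be used.

```python
def interpolate_from_standard_curve(standards, signal):
    """Estimate the amount of analyte that corresponds to a measured signal.

    standards: iterable of (known_amount, measured_signal) pairs obtained
               from dilutions of a control nucleic acid.
    signal:    signal measured for the experimental sample.
    The estimate is obtained by linear interpolation between the two
    standards whose signals bracket the sample signal.
    """
    points = sorted(standards, key=lambda point: point[1])
    for (amount_lo, signal_lo), (amount_hi, signal_hi) in zip(points, points[1:]):
        if signal_lo <= signal <= signal_hi:
            if signal_hi == signal_lo:
                return amount_lo
            fraction = (signal - signal_lo) / (signal_hi - signal_lo)
            return amount_lo + fraction * (amount_hi - amount_lo)
    raise ValueError("signal lies outside the range of the standard curve")


if __name__ == "__main__":
    # Hypothetical standards: (amount, absorbance). Not measured data.
    curve = [(0.0, 0.0), (10.0, 0.5), (20.0, 1.0)]
    print(interpolate_from_standard_curve(curve, 0.75))
```

Signals falling outside the range of the standards are rejected rather than extrapolated, since the curve is only established between the lowest and highest dilution.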

In another embodiment of the invention, the nucleic acid sequences, which 
encode casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, 
calreticulin, or homologous proteins, may also be used to generate 
hybridization probes, which are useful for mapping the naturally occurring 
genomic sequence. The sequences may be mapped to a particular 
chromosome or to a specific region of the chromosome using well known 
techniques. Such techniques include FISH, FACS, or artificial chromosome 
constructions, such as yeast artificial chromosomes, bacterial artificial 
chromosomes, bacterial P1 constructions or single chromosome cDNA 
libraries as reviewed in Price, C. M. (1993) Blood Rev. 7:127-134, and 
Trask, B. J. (1991) Trends Genet. 7:149-154. FISH (as described in Verma 
et al. (1988) Human Chromosomes: A Manual of Basic Techniques, 
Pergamon Press, New York, N.Y.) may be correlated with other physical 
chromosome mapping techniques and genetic map data. Examples of 
genetic map data can be found in the 1994 Genome Issue of Science 
(265:1981f). Correlation between the location of the gene encoding casein 
kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, or calreticulin 
on a physical chromosomal map and a specific disease, or predisposition to 
a specific disease, may help to delimit the region of DNA associated with 
that genetic disease. 

The nucleotide sequences of the subject invention may be used to detect 
differences in gene sequences between normal, carrier, or affected 
individuals. In situ hybridisation of chromosomal preparations and physical 
mapping techniques such as linkage analysis using established 
chromosomal markers may be used for extending genetic maps. Often the 
placement of a gene on the chromosome of another mammalian species, 
such as mouse, may reveal associated markers even if the number or arm 
of a particular human chromosome is not known. New sequences can be 
assigned to chromosomal arms, or parts thereof, by physical mapping. This 
provides valuable information to investigators searching for disease genes 
using positional cloning or other gene discovery techniques. Once the 
disease or syndrome has been crudely localised by genetic linkage to a 
particular genomic region, for example, AT to 11q22-23 (Gatti, R. A. et al. 
(1988) Nature 336:577-580), any sequences mapping to that area may 
represent associated or regulatory genes for further investigation. The 
nucleotide sequences of the subject invention may also be used to detect 
differences in the chromosomal location due to translocation, inversion, 
etc. among normal, carrier, or affected individuals. 

In another embodiment of the invention, the proteins, their catalytic or 
immunogenic fragments or oligopeptides thereof, an in vitro model, a 
genetically altered cell or animal, can be used for screening libraries of 
compounds in any of a variety of drug screening techniques. One can 
identify effectors, e.g. receptors, enzymes, proteins, ligands, or substrates 
that bind to, modulate or mimic the action of one or more of the proteins 
of the invention. The protein or fragment thereof employed in such 
screening may be free in solution, affixed to a solid support, borne on a cell 
surface, or located intracellularly. The formation of binding complexes, 
between the protein and the agent tested, may be measured. Agents could 
also, either directly or indirectly, influence the activity of the proteins of 
the invention. 

Candidate agents may also be found in kinase assays where a kinase 
substrate such as a protein or a peptide, which may or may not include 
modifications as further described below, or others are phosphorylated by 
the proteins or protein fragments of the invention. A therapeutic candidate 
agent may be identified by its ability to increase or decrease the enzymatic 
activity of the proteins of the invention. The kinase activity may be 
detected by change of the chemical, physical or immunological properties 
of the substrate due to phosphorylation. 

One example could be the transfer of radioisotopically labelled phosphate 
groups from an appropriate donor molecule to the kinase substrate 
catalyzed by the polypeptides of the invention. The phosphorylation of the 
substrate may be followed by detection of the substrate by autoradiography 
with techniques well known in the art. 

In yet another example, the change of mass of the substrate due to its 
phosphorylation may be detected by mass spectrometry techniques. 
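
As an illustrative sketch only, the number of phosphate groups added to a substrate may be estimated from the difference between the observed and the unmodified mass. The sketch relies on the monoisotopic mass increment of about 79.966 Da per added phosphate group (HPO3); the function name, the tolerance and the example masses are assumptions made for this example.

```python
# Monoisotopic mass added per phosphorylation event (HPO3), in daltons.
PHOSPHO_SHIFT_DA = 79.96633


def phosphate_count(mass_unmodified, mass_observed, tolerance=0.02):
    """Return the number of phosphate groups explaining the mass shift.

    Returns None when the shift is not, within `tolerance` daltons, a
    non-negative whole multiple of the phosphorylation increment.
    """
    delta = mass_observed - mass_unmodified
    count = round(delta / PHOSPHO_SHIFT_DA)
    if count < 0 or abs(delta - count * PHOSPHO_SHIFT_DA) > tolerance:
        return None
    return count


if __name__ == "__main__":
    # Hypothetical peptide masses, used only to show the calculation.
    print(phosphate_count(1000.0, 1000.0 + 2 * PHOSPHO_SHIFT_DA))
```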

One could also detect the phosphorylation status of a substrate with an 
analyte discriminating between the phosphorylated and unphosphorylated 
status of the substrate. Such an analyte may act by having different 
affinities for the phosphorylated and unphosphorylated forms of the 
substrate or by having specific affinity for phosphate groups. Such an 
analyte could be, but is not limited to, an antibody or antibody derivative, a 
recombinant antibody-like structure, a protein, a nucleic acid, a molecule 
containing a complexed metal ion, an anion exchange chromatography 
matrix, an affinity chromatography matrix or any other molecule with 
phosphorylation-dependent selectivity towards the substrate. 

Such an analyte could be employed to detect the kinase substrate, which 
is immobilized on a solid support during or after an enzymatic reaction. If 
the analyte is an antibody, its binding to the substrate could be detected 
by a variety of techniques as they are described in Harlow and Lane, 1998, 
Antibodies, CSH Lab Press, NY. If the analyte molecule is not an antibody, 
it may be detected by virtue of its chemical, physical or immunological 
properties, being endogenously associated with it or engineered to it. 

In yet another example, the kinase substrate may have features, designed 
or endogenous, to facilitate its binding or detection in order to generate a 
signal that is suitable for the analysis of the substrates phosphorylation 
status. These features may be, but are not limited to a biotin molecule or 
derivative thereof, a glutathione-S-transferase moiety, a moiety of six or 
more consecutive histidine residues, an amino acid sequence or hapten to 
function as an epitope tag, a fluorochrome, an enzyme or enzyme 
fragment. The kinase substrate may be linked to these or other features 
with a molecular spacer arm to avoid steric hindrance. 

In one example the kinase substrate may be labelled with a fluorochrome. 
The binding of the analyte to the labelled substrate in solution may be 
followed by the technique of fluorescence polarization as it is described in 
the literature (see, for example, Deshpande, S. et al. (1999) Prog. Biomed.
Optics (SPIE) 3603:261; Parker, G. J. et al. (2000) J. Biomol. Screen. 
5:77-88; Wu, P. et al. (1997) Anal. Biochem. 249:29-36). In a variation of 
this example, a fluorescent tracer molecule may compete with the 
substrate for the analyte to detect kinase activity by a technique which is
known to those skilled in the art as indirect fluorescence polarization.
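
To make the fluorescence polarization readout more concrete, the sketch below computes polarization (in milli-polarization units, mP) and anisotropy from the parallel and perpendicular emission intensities using the standard textbook definitions. The function names and the optional instrument correction factor (G-factor) are assumptions made for illustration; they are not taken from the cited literature.

```python
def polarization_mp(i_parallel, i_perpendicular, g_factor=1.0):
    """Fluorescence polarization in mP: 1000 * (I_par - G*I_perp) / (I_par + G*I_perp)."""
    perp = g_factor * i_perpendicular
    total = i_parallel + perp
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return 1000.0 * (i_parallel - perp) / total


def anisotropy(i_parallel, i_perpendicular, g_factor=1.0):
    """Fluorescence anisotropy: (I_par - G*I_perp) / (I_par + 2*G*I_perp)."""
    perp = g_factor * i_perpendicular
    total = i_parallel + 2.0 * perp
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return (i_parallel - perp) / total
```

A small labelled substrate tumbles quickly and gives a low value; binding of a larger analyte slows rotation and raises it, which is the change such an assay would follow.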

In vivo, the enzymatic kinase activity of the unmodified polypeptides of 
casein kinase 1 gamma and pyruvate kinase towards a substrate can be 
enhanced by appropriate stimuli, triggering the phosphorylation of casein 
kinase 1 gamma and pyruvate kinase. This may be induced in the natural 
context by extracellular or intracellular stimuli, such as signaling molecules 
or environmental influences. One may generate a system containing
activated casein kinase 1 gamma and pyruvate kinase, be it an
organism, a tissue, a culture of cells or a cell-free environment, by
exogenously applying this stimulus or by mimicking this stimulus by a
variety of techniques, some of which are described further below. A system
containing activated casein kinase 1 gamma and pyruvate kinase may be
produced for the purpose of diagnosis, study, prevention, and treatment
of diseases and disorders related to body-weight regulation and 
thermogenesis, for example, but not limited to, metabolic diseases such as 
obesity, as well as related disorders such as eating disorder, cachexia, 
diabetes mellitus, hypertension, coronary heart disease, 
hypercholesterolemia, dyslipidemia, osteoarthritis, and gallstones. 

In addition, the activity of casein kinase 1 gamma, GABARAP, PA2G4, MOCS1,
CDC10, PK, or calreticulin against its physiological substrate(s) or 
derivatives thereof could be measured in cell-based assays. Agents may 
also interfere with posttranslational modifications of the protein, such as 
phosphorylation and dephosphorylation, farnesylation, palmitoylation, 
acetylation, alkylation, ubiquitination, proteolytic processing, subcellular 
localization and degradation. Moreover, agents could influence the 
dimerization or oligomerization of the proteins of the invention or, in a 
heterologous manner, of the proteins of the invention with other proteins, 
for example, but not exclusively, docking proteins, enzymes, receptors, or 
translation factors. Agents could also act on the physical interaction of the 
proteins of this invention with other proteins, which are required for protein
function, for example, but not exclusively, their downstream signaling. 

Methods for determining protein-protein interaction are well known in the 
art. For example, binding of a fluorescently labeled peptide derived from the
interacting protein to the protein of the invention, or vice versa, could be
detected by a change in polarisation. Where both binding partners are
fluorescently labeled (either both as full-length proteins, or one as the
full-length protein and the other represented by a peptide), binding could
be detected by fluorescence resonance energy transfer (FRET) from one
fluorophore to the other. In addition, a variety of
commercially available assay principles suitable for detection of
protein-protein interaction are well known in the art, for example but not
exclusively AlphaScreen (PerkinElmer) or Scintillation Proximity Assays
(SPA) by Amersham. Alternatively, the interaction of the proteins of the
invention with cellular proteins could be the basis for a cell-based screening 
assay, in which both proteins are fluorescently labeled and interaction of 
both proteins is detected by analysing cotranslocation of both proteins with 
a cellular imaging reader, as has been developed for example, but not 
exclusively, by Cellomics or EvotecOAI. In all cases the two or more 
binding partners can be different proteins with one being the protein of the 
invention, or in case of dimerization and/or oligomerization the protein of 
the invention itself. Proteins of the invention, for which one target 
mechanism of interest, but not the only one, would be such protein/protein 
interactions are casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, 
CDC10, PK, and calreticulin.
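
For the FRET-based detection of protein-protein interaction mentioned above, the following sketch shows two standard textbook relations: transfer efficiency as a function of donor-acceptor distance relative to the Förster radius, and efficiency estimated from donor quenching. The function names and any input values are purely illustrative assumptions and are not data from this specification.

```python
def fret_efficiency_from_distance(distance_nm, forster_radius_nm):
    """E = 1 / (1 + (r / R0)^6), with r and R0 in the same units."""
    if distance_nm <= 0 or forster_radius_nm <= 0:
        raise ValueError("distances must be positive")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


def fret_efficiency_from_donor_quench(donor_with_acceptor, donor_alone):
    """E = 1 - F_DA / F_D, from donor emission with and without the acceptor."""
    if donor_alone <= 0:
        raise ValueError("donor-alone intensity must be positive")
    return 1.0 - donor_with_acceptor / donor_alone
```

Because the efficiency falls off with the sixth power of the distance, a measurable signal is expected only when the two labelled partners are in close proximity, for example when they are bound to each other.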

Assays for determining enzymatic activity of the proteins of the invention 
are well known in the art. 

Of particular interest are screening assays for agents that have a low 
toxicity for mammalian cells. The term "agent" as used herein describes 
any molecule, e.g. protein or pharmaceutical, with the capability of altering
or mimicking the physiological function of one or more of the proteins of 
the invention. Candidate agents encompass numerous chemical classes, 
though typically they are organic molecules, preferably small organic 
compounds having a molecular weight of more than 50 and less than 
about 2,500 Daltons. Candidate agents comprise functional groups 
necessary for structural interaction with proteins, particularly hydrogen 
bonding, and typically include at least an amine, carbonyl, hydroxyl or
carboxyl group, preferably at least two of the functional chemical groups. 
The candidate agents often comprise carbocyclic or heterocyclic structures 
and/or aromatic or polyaromatic structures substituted with one or more of 
the above functional groups. 

Candidate agents are also found among biomolecules including peptides, 
saccharides, fatty acids, steroids, purines, pyrimidines, nucleic acids and 
derivatives, structural analogs or combinations thereof. Candidate agents 
are obtained from a wide variety of sources including libraries of synthetic 
or natural compounds. For example, numerous means are available for 
random and directed synthesis of a wide variety of organic compounds and 
biomolecules, including expression of randomized oligonucleotides and 
oligopeptides. Alternatively, libraries of natural compounds in the form of 
bacterial, fungal, plant and animal extracts are available or readily 
produced. Additionally, natural or synthetically produced libraries and 
compounds are readily modified through conventional chemical, physical 
and biochemical means, and may be used to produce combinatorial 
libraries. Known pharmacological agents may be subjected to directed or 
random chemical modifications, such as acylation, alkylation, esterification, 
amidification, etc. to produce structural analogs. Where the screening 
assay is a binding assay, one or more of the molecules may be joined to a 
label, where the label can directly or indirectly provide a detectable signal. 

Another technique for drug screening, which may be used, provides for
high throughput screening of compounds having suitable binding affinity to
the protein of interest as described in published PCT application
WO84/03564. In this method, as applied to the protein of the invention,
large numbers of different small test compounds, e.g. aptamers, peptides, 
low-molecular weight compounds, etc. are synthesized on a solid 
substrate, such as plastic pins or some other surface. The test compounds 
are reacted with the protein, or fragments thereof, and washed. Bound 
proteins are then detected by methods well known in the art. Purified 
proteins can also be coated directly onto plates for use in the 
aforementioned drug screening techniques. Alternatively, non-neutralizing 
antibodies can be used to capture the peptide and immobilise it on a solid 
support. In another embodiment, one may use competitive drug screening 
assays in which neutralizing antibodies capable of binding the protein 
specifically compete with a test compound for binding the protein. In this 
manner, the antibodies can be used to detect the presence of any peptide, 
which shares one or more antigenic determinants with the protein of the 
invention. 

The nucleic acids encoding the proteins of the invention can be used to 
generate transgenic cell lines and animals. These transgenic non-human 
animals are useful in the study of the function and regulation of the 
proteins of the invention in vivo. Transgenic animals, particularly 
mammalian transgenic animals, can serve as a model system for the 
investigation of many developmental and cellular processes common to 
humans. A variety of non-human models of metabolic disorders can be 
used to test modulators of the protein of the invention. Misexpression (for 
example, overexpression or lack of expression) of the protein of the 
invention, particular feeding conditions, and/or administration of 
biologically active compounds can create models of metabolic disorders.

In one embodiment of the invention, such assays use mouse models of
insulin resistance and/or diabetes, such as mice carrying gene knockouts in
the leptin pathway (for example, ob (leptin) or db (leptin receptor) mice).
Such mice develop typical symptoms of diabetes, show hepatic lipid
accumulation and frequently have increased plasma lipid levels (see
Bruning et al, 1998, Mol. Cell. 2:449-569). Susceptible wild type mice (for
example C57Bl/6) show similar symptoms if fed a high fat diet. In addition
to testing the expression of the proteins of the invention in such mouse 
strains (see EXAMPLE 4), these mice could be used to test whether 
administration of a candidate modulator alters for example lipid 
accumulation in the liver, in plasma, or adipose tissues using standard 
assays well known in the art, such as FPLC, colorimetric assays, blood 
glucose level tests, insulin tolerance tests and others. 

Transgenic animals may be made through homologous recombination in 
embryonic stem cells, where the normal locus of the gene encoding the 
protein of the invention is mutated. Alternatively, a nucleic acid construct 
encoding the protein is injected into oocytes and is randomly integrated 
into the genome. One may also express the genes of the invention or 
variants thereof in tissues where they are not normally expressed or at 
abnormal times of development. Furthermore, variants of the genes of the 
invention like specific constructs expressing anti-sense molecules or 
expression of dominant negative mutations, which will block or alter the 
expression of the proteins of the invention may be randomly integrated into 
the genome. A detectable marker, such as lac Z or luciferase may be 
introduced into the locus of the genes of the invention, where upregulation
of expression of the genes of the invention will result in an easily 
detectable change in phenotype. Vectors for stable integration include 
plasmids, retroviruses and other animal viruses, yeast artificial 
chromosomes (YACs), and the like. DNA constructs for homologous 
recombination will contain at least portions of the genes of the invention 
with the desired genetic modification, and will include regions of homology 
to the target locus. Conveniently, markers for positive and negative
selection are included. DNA constructs for random integration do not need 
to contain regions of homology to mediate recombination. DNA constructs 
for random integration will consist of the nucleic acids encoding the 
proteins of the invention, a regulatory element (promoter), an intron and a 
poly-adenylation signal. Methods for generating cells having targeted gene 
modifications through homologous recombination are known in the field. 
For embryonic stem (ES) cells, an ES cell line may be employed, or 
embryonic cells may be obtained freshly from a host, e.g. mouse, rat, 
guinea pig, etc. Such cells are grown on an appropriate fibroblast-feeder 
layer and are grown in the presence of leukemia inhibiting factor (LIF). ES 
or embryonic cells may be transfected and can then be used to produce 
transgenic animals. After transfection, the ES cells are plated onto a feeder 
layer in an appropriate medium. Cells containing the construct may be 
selected by employing a selection medium. After sufficient time for 
colonies to grow, they are picked and analyzed for the occurrence of 
homologous recombination. Colonies that are positive may then be used for 
embryo manipulation and morula aggregation. Briefly, morulae are obtained 
from 4 to 6 week old superovulated females, the Zona Pellucida is removed 
and the morulae are put into small depressions of a tissue culture dish. The 
ES cells are trypsinized, and the modified cells are placed into the 
depression close to the morulae. On the following day the aggregates are
transferred into the uterine horns of pseudopregnant females. Females are
then allowed to go to term. Chimeric offspring can be readily detected by
a change in coat color and are subsequently screened for the transmission
of the mutation into the next generation (F1-generation). Offspring of the
F1-generation are screened for the presence of the modified gene and
males and females having the modification are mated to produce
homozygous progeny. If the gene alterations cause lethality at some point 
in development, tissues or organs can be maintained as allogenic or 
congenic grafts or transplants, or in vitro culture. The transgenic animals 
may be any non-human mammal, such as laboratory animals, domestic
animals, etc., for example, mouse, rat, guinea pig, sheep, cow, pig, and
others. The transgenic animals may be used in functional studies, drug 
screening, and other applications and are useful in the study of the 
function and regulation of the proteins of the invention in vivo. 

Finally, the invention also relates to a kit comprising at least one of 

(a) a casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, 
or calreticulin nucleic acid molecule or a fragment thereof; 

(b) a casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, 
or calreticulin amino acid molecule or a fragment or an isoform 
thereof; 

(c) a vector comprising the nucleic acid of (a); 

(d) a host cell comprising the nucleic acid of (a) or the vector of (c);

(e) a polypeptide encoded by the nucleic acid of (a); 

(f) a fusion polypeptide encoded by the nucleic acid of (a); 

(g) an antibody, an aptamer or another receptor against the nucleic acid 
of (a) or the polypeptide of (b), (e) or (f) and 

(h) an anti-sense oligonucleotide of the nucleic acid of (a). 

The kit may be used for diagnostic or therapeutic purposes or for screening 
applications as described above. The kit may further contain user 
instructions. 

The Figures show: 

Figure 1 shows the triglyceride content of a gilgamesh casein kinase 1 
(gish; Gadfly Accession Number CG6963) mutant. Shown is the increase 
of triglyceride content of HD-EP(3)37409 flies (referred to as 
"HD-EP37409" in column 2) caused by homozygous viable integration of 
the P-vector into the promoter region of the second transcription unit of
gilgamesh, in comparison to controls with integration of this vector type
(referred to as "EP-control" in column 1). Also shown is that ectopic
expression of the gilgamesh gene mainly in the fat body of the flies 
(referred to as "HD-EP37409/FB" in column 4) in comparison to controls 
with integration of this vector type (referred to as "random EP/FB" in 
column 3) causes no change of triglyceride content, and that ectopic 
expression of the gilgamesh gene mainly in the neurons of the flies 
(referred to as "HD-EP37409/elav" in column 6) in comparison to controls 
with integration of this vector type (referred to as "random EP/elav" in 
column 5) causes a decrease of triglyceride content. 

Figure 2 shows the molecular organisation of the mutated gilgamesh casein 
kinase 1 (Gadfly Accession Number CG6963) gene locus. 

Figure 3 shows the human homologs of Gadfly Accession Number CG6963 
(gilgamesh) 

Figure 3A. BLASTP search result for Gadfly Accession Number CG6963 
(Query) with the best human homolog match (Sbjct)
Figure 3B shows the nucleotide sequence encoding human casein kinase 1,
gamma 1 (Genbank Accession Number AB042563; SEQ ID NO:1)
Figure 3C shows the amino acid sequence of human casein kinase 1,
gamma 1 (Genbank Accession Number Q9HCP0; SEQ ID NO:2)
Figure 3D shows the nucleotide sequence encoding human casein kinase
1, gamma 2 (Genbank Accession Number NM_001319; SEQ ID NO:3)
Figure 3E shows the amino acid sequence of human casein kinase 1,
gamma 2 (Genbank Accession Number NP_001310; SEQ ID NO:4)
Figure 3F shows the nucleotide sequence encoding human casein kinase 1,
gamma 3 (Genbank Accession Number NM_004384; SEQ ID NO:5) 
Figure 3G shows the amino acid sequence of human casein kinase 1, 
gamma 3 (Genbank Accession Number NP_004375; SEQ ID NO:6). 

Figure 4 shows the comparison (Clustal W (1.83) protein sequence 
alignment analysis) of human and Drosophila casein kinase 1 proteins. 
Gaps in the alignment are represented as -. In the figure 'CK1 g3 Hs' refers
to human casein kinase 1 gamma 3, 'CK1 g1 Hs' refers to human casein
kinase 1 gamma 1, 'CK1 g2 Hs' refers to human casein kinase 1 gamma 2,
and 'CG6963 Dm' refers to the Drosophila gilgamesh gene product with 
Gadfly Accession Number CG6963. 

Figure 5 shows the analysis of casein kinase 1, gamma 1 and casein 
kinase 1, gamma 3 expression in mammalian tissues. The relative 
RNA-expression is shown on the Y-axis, the tissues tested are given on the 
X-axis. WAT refers to white adipose tissue, BAT refers to brown adipose 
tissue. 

Figure 5A shows the real-time PCR analysis of casein kinase 1, gamma 1
expression in mouse wildtype tissues. 

Figure 5B shows the real-time PCR analysis of casein kinase 1, gamma 1 
expression in wildtype mice (WT-mice), compared to genetically obese 
mice (ob/ob-mice) and to fasted mice (fasted-mice). 

Figure 5C shows the real-time PCR analysis of casein kinase 1, gamma 3 
expression in mouse wildtype tissues. WAT refers to white adipose tissue, 
BAT refers to brown adipose tissue. 

Figure 5D shows the real-time PCR analysis of casein kinase 1, gamma 3 
expression in wildtype mice (WT-mice), compared to genetically obese 
mice (ob/ob-mice) and to fasted mice (fasted-mice). 

Figure 6 shows the decrease of triglyceride content of PX6298.1 flies 
caused by integration of the P-vector (in comparison to controls with 
integration of these vectors elsewhere in genome). 

Figure 7 shows the molecular organisation of the mutated GABARAP 
(Gadfly Accession Number CG1534) gene locus. The Gadfly Accession 
Number CG1534 annotated gene encodes three different transcripts. Only
one of these transcripts encodes GABARAP (Gadfly Accession Number
CT3947; synonym for Gadfly Accession Number CG32672 and Genbank
Accession Number NM_167245).

Figure 8 shows the human homologs of Gadfly Accession Number CG1534

Figure 8A. BLASTP search result for Gadfly Accession Number CG1534
(Query) with the best human homolog match (Sbjct)
Figure 8B shows the nucleotide sequence encoding human GABARAP
(Genbank Accession Number NM_007278; SEQ ID NO:7)
Figure 8C shows the amino acid sequence of human GABARAP (Genbank
Accession Number NP_009209; SEQ ID NO:8)
Figure 8D shows the nucleotide sequence encoding GABARAP like 1
(Genbank Accession Number NM_031412; SEQ ID NO:9)
Figure 8E shows the amino acid sequence of human GABARAP like 1
(Genbank Accession Number NP_113600; SEQ ID NO:10)
Figure 8F shows the nucleotide sequence encoding human GABARAP like
2 (Genbank Accession Number NM_007285; SEQ ID NO:11)
Figure 8G shows the amino acid sequence of human GABARAP like 2
(Genbank Accession Number NP_009216; SEQ ID NO:12)
Figure 8H shows the nucleotide sequence encoding human GABARAP like
3 (Genbank Accession Number NM_032568; SEQ ID NO:13)
Figure 8I shows the amino acid sequence of human GABARAP like 3
(Genbank Accession Number NP_115957; SEQ ID NO:14)

Figure 9 shows a comparison (Clustal W (1.82) protein sequence alignment
analysis) of human and Drosophila GABARAP proteins. Gaps in the
alignment are represented as -. In the figure 'GABARAP-l3 Hs' refers to
human GABARAP like 3, 'GABARAP-l1 Hs' refers to human GABARAP like
1, 'CG1534 Dm' refers to Drosophila protein encoded by Gadfly Accession
Number CG1534, 'CG12334 Dm' refers to Drosophila protein encoded by
Gadfly Accession Number CG12334, and 'GABARAP-l2 Hs' refers to the
human GABARAP like 2.

Figure 10 shows the analysis of GABARAP 2 expression in mammalian
tissues. The relative RNA-expression is shown on the Y-axis. In Figure 10A
and 10B the tissues tested are given on the X-axis. WAT refers to white
adipose tissue, BAT refers to brown adipose tissue. In Figure 10C, the
X-axis represents the time axis. 'd0' refers to day 0 (start of the
experiment), and 'd2' - 'd10' refer to day 2 - day 10 of adipocyte
differentiation.

Figure 10A shows the real-time PCR analysis of GABARAP 2 expression in
mouse wildtype tissues.

Figure 10B shows the real-time PCR analysis of GABARAP 2 expression in
wildtype mice (WT-mice), compared to genetically obese mice (ob/ob-mice)
and to fasted mice (fasted-mice).

Figure 10C shows the real-time PCR analysis of GABARAP 2 expression in
mammalian fibroblast (3T3-F442A) cells, during the differentiation from 
preadipocytes to mature adipocytes. 

Figure 11 shows the increase of triglyceride content of EP(3)3271 flies 
caused by homozygous viable integration of the P-vector into the first exon 
of Gadfly Accession Number CG10576 (in comparison to controls with
integration of these vectors).

Figure 12 shows the molecular organisation of the mutated methionyl
aminopeptidase (Gadfly Accession Number CG10576) gene locus.

Figure 13 shows the human homologs of Gadfly Accession Number 
CG10576 

Figure 13A shows the BLASTP search result for Gadfly Accession Number
CG10576 (Query) with the best human homolog match (Sbjct)
Figure 13B shows the nucleotide sequence encoding human proliferation
associated protein 2G4, 38 kDa (Genbank Accession Number NM_006191;
SEQ ID NO:15)
Figure 13C shows the amino acid sequence of human proliferation
associated protein 2G4, 38 kDa (Genbank Accession Number NP_006182;
SEQ ID NO:16).

Figure 14 shows the ClustalW (1.7) protein alignment for the Drosophila 
protein encoded by Gadfly Accession Number CG10576 and human
p38-2G4 (referred to as 'XP_049048.1').

Figure 15 shows the analysis of proliferation associated 2G4 protein, 38
kDa (PA2G4) expression in mammalian tissues. The relative 
RNA-expression is shown on the Y-axis, the tissues tested are given on the 
X-axis. WAT refers to white adipose tissue, BAT refers to brown adipose 
tissue. 

Figure 15A shows the real-time PCR analysis of PA2G4 expression in
mouse wildtype tissues. 

Figure 15B shows the real-time PCR analysis of PA2G4 expression in 
wildtype mice (WT-mice), compared to genetically obese mice (ob/ob-mice) 
and to fasted mice (fasted-mice). 

Figure 16 shows the increase of triglyceride content of EP(3)3688 flies 
caused by homozygous viable integration of the P-vector (in comparison to 
controls without integration of this vector). 

Figure 17 shows the molecular organisation of the mutated Mocs1 (Gadfly
Accession Number CG7858) gene locus.

Figure 18 shows the human homologs of Gadfly Accession Number
CG7858 (Mocs1)

Figure 18A shows the BLASTP search results for Gadfly Accession Number
CG7858 (Mocs1) (referred to as 'Query'); only the human homologs
(referred to as 'Sbjct') with the highest homology values are shown.
Figure 18B shows the nucleotide sequence encoding human MOCSA and
MOCSC (Genbank Accession Number AF034374; SEQ ID NO:17)
Figure 18C shows the amino acid sequence of human MOCSA (Genbank
Accession Number AAB87523; SEQ ID NO:18)

Figure 18D shows the nucleotide sequence encoding human MOCS1 
protein, isoform 1 (Genbank Accession Number XM_166358; SEQ ID
NO:19)

Figure 18E shows the amino acid sequence of human MOCS1 protein,
isoform 1 (Genbank Accession Number XP_166358; SEQ ID NO:20)
Figure 18F shows the nucleotide sequence encoding human MOCS1,
isoform 2 (Genbank Accession Number NM_005942; SEQ ID NO:21)
Figure 18G shows the amino acid sequence of human MOCS1, isoform 2
(Genbank Accession Number NP_005933; SEQ ID NO:22)
Figure 18H shows the nucleotide sequence encoding human MOCS1,
isoform 3 (Genbank Accession Number NM_138928; SEQ ID NO:23)
Figure 18I shows the amino acid sequence of human MOCS1, isoform 3
(Genbank Accession Number NP_620306; SEQ ID NO:24). 

Figure 19 shows the comparison (Clustal W (1.82) protein sequence 
alignment analysis) of human and Drosophila Mocs1 proteins. Gaps in the
alignment are represented as -. In the figure 'Mocs1-2 Hs' refers to human
Mocs1, isoform 2, 'Mocs1-3 Hs' refers to human Mocs1, isoform 3,
'Mocs1-1 Hs' refers to human Mocs1, isoform 1, 'Mocs1 Hs' refers to
human MocsA, 'Mocs1-PA Dm' refers to Drosophila Mocs1 protein variant
A, and 'Mocs1-PC Dm' refers to Drosophila Mocs1 protein variant C.

Figure 20 shows the analysis of Mocs expression in mammalian tissues. 
The relative RNA-expression is shown on the Y-axis, the tissues tested are 
given on the X-axis. WAT refers to white adipose tissue, BAT refers to 
brown adipose tissue. 

Figure 20A shows the real-time PCR analysis of Mocs expression in mouse
wildtype tissues.

Figure 20B shows the real-time PCR analysis of Mocs expression in
wildtype mice (WT-mice), compared to genetically obese mice (ob/ob-mice) 
and to fasted mice (fasted-mice). 

Figure 21 shows the triglyceride content of a peanut protein (pnut; Gadfly 
Accession Number CG8705) mutant. Shown is the increase of triglyceride 
content of EP(2)2036 flies caused by homozygous viable integration of the 
P-vector into the promoter/enhancer of peanut (in comparison to controls 
-EP control- with integration of these vectors). 

Figure 22 shows the molecular organisation of the mutated pnut (Gadfly 
Accession Number CG8705) gene locus. 

Figure 23 shows the human homologs of Gadfly Accession Number 
CG8705 (peanut) 

Figure 23A shows the BLASTP search results for Gadfly Accession Number 
CG8705 

Figure 23B shows the nucleotide sequence encoding human CDC10 cell 
division cycle 10 homolog (Genbank Accession Number NM_001788; SEQ 
ID NO:25) 

Figure 23C shows the amino acid sequence of human CDC10 cell division 
cycle 10 homolog (Genbank Accession Number NP_001779; SEQ ID
NO:26). 

Figure 24 shows the ClustalW (1.7) protein sequence alignment for Gadfly
Accession Number CG8705, human CDC10 ('XM_011595'), and human
CDC10 homolog (septin) ('NM_001788')

Figure 25 shows the analysis of the peanut homolog (referred to as 
'Peanut') expression in mammalian tissues. The relative RNA-expression is 
shown on the Y-axis, the tissues tested are given on the X-axis. WAT 
refers to white adipose tissue, BAT refers to brown adipose tissue. 

Figure 25A shows the real-time PCR analysis of Peanut expression in
mouse wildtype tissues. 

Figure 25B shows the real-time PCR analysis of Peanut expression in 
wildtype mice (WT-mice), compared to genetically obese mice (ob/ob-mice) 
and to fasted mice (fasted-mice). 

Figure 26 shows the triglyceride content of a pyruvate kinase protein 
(Gadfly Accession Number CG7069) mutant. Shown is the increase of 
triglyceride content of EP(3)3224 flies caused by homozygous viable 
integration of the P-vector into the second exon of the pyruvate kinase 
gene (in comparison to controls -EP control- with integration of these 
vectors). 

Figure 27 shows the molecular organization of the mutated pyruvate kinase 
(Gadfly Accession Number CG7069) gene locus. 

Figure 28 shows the human homologs of Gadfly Accession Number 
CG7069 

Figure 28A shows the BLASTP search result for Gadfly Accession Number 
CG7069 (Query) with the best human homologous match (Sbjct).
Figure 28B shows the nucleotide sequence encoding human pyruvate
kinase, muscle (Genbank Accession Number X56494; SEQ ID NO:27)
Figure 28C shows the amino acid sequence of human pyruvate kinase,
muscle, M1 isozyme (Genbank Accession Number P14618; SEQ ID NO:28)
Figure 28D shows the amino acid sequence of human pyruvate kinase,
muscle, M2 isozyme (Genbank Accession Number P14786; SEQ ID NO:29)
Figure 28E shows the nucleotide sequence encoding human pyruvate
kinase, liver and RBC (Genbank Accession Number NM_000298; SEQ ID
NO:30)

Figure 28F shows the amino acid sequence of human pyruvate kinase, liver
and RBC (Genbank Accession Number NP_000289; SEQ ID NO:31).

Figure 29 shows the ClustalW (1.7) protein sequence alignment analysis of
Drosophila, mouse, and human pyruvate kinase. In the figure 'pk3_h2'
refers to human pyruvate kinase, muscle (Genbank Accession Number
NP_002645), 'pk3_h' refers to human pyruvate kinase, muscle (Genbank
Accession Number XM_037768), 'pk3_m' refers to mouse pyruvate kinase
3 (Genbank Accession Number BC016619), and 'pk3_dro' refers to 
Drosophila pyruvate kinase (Gadfly Accession Number CG7069). 

Figure 30 shows the increase of triglyceride content of EP(3)3321, 
EP(3)0834, and EP(3)0979 flies caused by homozygous viable integration 
of the P-vector in the transcription unit of Gadfly Accession Number 
CG9429 (in comparison to controls without integration of this vector). 

Figure 31 shows the molecular organisation of the calreticulin (Crc; Gadfly 
Accession Number CG9429) gene locus. 

Figure 32 shows the human homologs of Gadfly Accession Number 
CG9429 (calreticulin) 

Figure 32A shows the BLASTP search result for Gadfly Accession Number
CG9429 (Query) with the best human homologous match (Sbjct).
Figure 32B shows the nucleotide sequence encoding human Calreticulin
(Genbank Accession Number NM_004343; SEQ ID NO:32)
Figure 32C shows the amino acid sequence of human Calreticulin
(Genbank Accession Number NP_004334; SEQ ID NO:33)
Figure 32D shows the nucleotide sequence encoding human Calreticulin 2
(Genbank Accession Number NM_145046; SEQ ID NO:34)
Figure 32E shows the amino acid sequence of human Calreticulin 2
(Genbank Accession Number NP_659483; SEQ ID NO:35).

Figure 33 shows the comparison (Clustal W (1.82) protein sequence 
alignment analysis) of human and Drosophila calreticulin proteins. Gaps in 
the alignment are represented as -. In the figure 'crc Dm' refers to
Drosophila calreticulin, 'crc Hs' refers to human calreticulin, and
'MGC26577 Hs' refers to human calreticulin 2. 



The examples illustrate the invention: 



Example 1: Measurement of triglyceride content

Mutant flies are obtained from a fly mutation stock collection. The flies are
grown under standard conditions known to those skilled in the art. In the
course of the experiment, additional feedings with baker's yeast
(Saccharomyces cerevisiae) are provided. The average increase of
triglyceride content of Drosophila containing the EP-vectors in homozygous
or hemizygous viable integration was investigated in comparison to control
flies (see FIGURES 1, 6, 11, 16, 21, 26, and 30). For determination of
triglyceride, flies were incubated for 5 min at preferably 90°C in an
aqueous buffer using a waterbath, followed by hot extraction. After
another 5 min incubation at preferably 90°C and mild centrifugation, the
triglyceride content of the fly extract was determined using the Sigma
Triglyceride (INT 336-10 or -20) assay by measuring changes in the optical
density according to the manufacturer's protocol. As a reference, the
protein content of the same extract was measured using the BIO-RAD DC
Protein Assay according to the manufacturer's protocol. The assay was
repeated several times.

The average triglyceride level of all flies of the EP collections (referred to as
'EP-control') is shown as 100% (ratio triglyceride content/protein content)
in the first columns in FIGURES 1, 11, 16, 21, 26, and 30, including
standard deviation. The average triglyceride level of all flies of the PX
collection (referred to as 'PX-lines') is shown as 1 (relative amount of
triglyceride/fly) in the first column in FIGURE 6, including standard
deviation. The average triglyceride level of all flies containing the FB-Gal4
vector (referred to as 'random EP/FB') is shown as 100% (ratio triglyceride
content/protein content) in the third column in FIGURE 1. The average
triglyceride level of all flies containing the elav-Gal4 vector (referred to as
'random EP/elav') is shown as 100% in the fifth column in FIGURE 1.
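
The normalisation described above (triglyceride content divided by protein content, expressed relative to a control mean set to 100%) can be sketched as follows. This is only an illustration under stated assumptions: the function name and all numerical values are hypothetical and are not taken from the assay protocols or from the figures.

```python
from statistics import mean, stdev


def relative_triglyceride(tg, protein, control_ratios):
    """Triglyceride/protein ratio of one sample as a percentage of the control mean."""
    ratio = tg / protein
    return 100.0 * ratio / mean(control_ratios)


# Hypothetical triglyceride/protein ratios of control flies (illustration only).
control_ratios = [0.50, 0.45, 0.55]

# Hypothetical measurement for one mutant line (illustration only).
mutant_percent = relative_triglyceride(tg=0.71, protein=1.0,
                                       control_ratios=control_ratios)

# Standard deviation of the controls on the same percentage scale.
control_sd_percent = 100.0 * stdev(control_ratios) / mean(control_ratios)

print(f"mutant: {mutant_percent:.0f}% of control")
print(f"control SD: {control_sd_percent:.0f}%")
```

Expressing the mutant value and the control standard deviation on the same percentage scale makes it easy to see whether a deviation from 100% exceeds the spread of the controls.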

HD-EP(3)37409 homozygous flies consistently show a higher triglyceride
content than the controls (142%; column 2 in FIGURE 1). The offspring of
HD-EP(3)37409 males crossed to FB-Gal4 virgins carry a copy of the
HD-EP(3)37409 vector and a copy of the FB-Gal4 vector, leading to
ectopic expression of adjacent genomic DNA sequences 3' of the
HD-EP(3)37409 integration locus, mainly in the fatbody of these flies.
These offspring show no change in triglyceride content compared with the
controls (103%; column 4 in FIGURE 1). The offspring of HD-EP(3)37409
males crossed to elav-Gal4 virgins carry a copy of the HD-EP(3)37409
vector and a copy of the elav-Gal4 vector, leading to ectopic expression of
adjacent genomic DNA sequences 3' of the HD-EP(3)37409 integration
locus, mainly in the neurons of these flies. These offspring consistently
show a lower triglyceride content than the controls (70%; column 6 in
FIGURE 1). Therefore, both the loss of gene activity and the gain of gene
activity in the locus 98B17-19 on chromosome 3R, where the EP-vector of
HD-EP(3)37409 flies is homozygous viable integrated 5' of the gilgamesh
gene, are responsible for changes in the metabolism of the energy storage
triglycerides.

PX6298.1 hemizygous flies consistently show a lower triglyceride content
than the controls (column 2 in FIGURE 6). Therefore, the change of gene
activity in the locus of the PX6298.1 integration on chromosome X, where
the PX-vector of PX6298.1 flies is hemizygous viable integrated, is
responsible for changes in the metabolism of the energy storage
triglycerides.

EP(3)3271 homozygous flies consistently show a higher triglyceride content
than the controls (column 2 in FIGURE 11). Therefore, the loss of gene
activity in the locus 64F1 on chromosome 3L, where the EP-vector of
HD-EP(3)3271 flies is homozygous viable integrated, is responsible for
changes in the metabolism of the energy storage triglycerides.

EP(3)3688 homozygous flies consistently show a higher triglyceride content
than the controls (column 2 in FIGURE 16). Therefore, the loss of gene
activity in the locus 68A3-68A3 on chromosome 3L, where the EP-vector
of EP(3)3688 flies is homozygous viable integrated, is responsible for
changes in the metabolism of the energy storage triglycerides.

EP(2)2036 homozygous flies consistently show a higher triglyceride content
than the controls (column 2 in FIGURE 21). Therefore, the loss of gene
activity in the locus 44B3-44B4 on chromosome 2R, where the EP-vector of
EP(2)2036 flies is homozygous viable integrated, is responsible for changes
in the metabolism of the energy storage triglycerides.

EP(3)3224 homozygous flies consistently show a higher triglyceride content
than the controls (153%; column 2 in FIGURE 26). Therefore, the loss of
gene activity in the locus 94A15-16 on chromosome 3R, where the
EP-vector of EP(3)3224 flies is homozygous viable integrated, is
responsible for changes in the metabolism of the energy storage
triglycerides.

EP(3)3321, EP(3)0834, and EP(3)0979 homozygous flies consistently show
a higher triglyceride content than the controls (columns 2 to 4 in FIGURE
30). Therefore, the loss of gene activity in the locus 85E2 on chromosome
3R, where the EP-vectors of EP(3)3321, EP(3)0979, or EP(3)0834 flies are
homozygous viable integrated, is responsible for changes in the metabolism
of the energy storage triglycerides.

Example 2: Identification of Drosophila genes and proteins associated with
metabolic control

Nucleic acids encoding the proteins of the present invention were identified
using a plasmid-rescue technique. Genomic DNA sequences were isolated
that are localized to the integration sites of the EP vectors (herein
HD-EP(3)37409, PX6298.1, EP(3)3271, EP(3)3688, EP(2)2036, EP(3)3224,
EP(3)3321, EP(3)0834, and EP(3)0979). Using those isolated genomic
sequences, public databases like Berkeley Drosophila Genome Project
(GadFly) were screened, thereby identifying the integration sites of the
vectors and the corresponding genes. The molecular organization of these
gene loci is shown in FIGURES 2, 7, 12, 17, 22, 27, and 31.

In FIGURE 2, genomic DNA sequence is represented by the assembly as a
thin black line in the middle (numbers represent the length in basepairs of
the genomic DNA) that includes the integration site of the vector for line
HD-EP(3)37409. Transcribed DNA sequences (ESTs) and predicted exons
are shown as bars on the two sides (sense and antisense strand). Predicted
exons of the cDNA with GadFly Accession Number CG6963 (referred to as
gilgamesh or gish) are shown as dark grey bars and introns as light grey
lines. The sequence corresponds to a gene that is predicted by GadFly
sequence analysis programs as Accession Number CG6963. Public DNA
sequence databases (for example, NCBI GenBank) were screened, thereby
identifying the integration site of line HD-EP(3)37409, causing an
increase of triglyceride content. HD-EP(3)37409 is integrated into the
promoter region of the second transcription unit in sense orientation of the
cDNA with GadFly Accession Number CG6963 (the site of integration is
shown as vertical dotted line). Therefore, expression of the cDNA encoding
gilgamesh could be affected by integration of the vector of line
HD-EP(3)37409, or gilgamesh could be ectopically expressed, e.g. in
neurons, leading to a change of the energy storage triglycerides.

In FIGURE 7, genomic DNA sequence is represented by the assembly as a
thin black line in the middle (numbers represent the length in basepairs of
the genomic DNA) that includes the integration site of the vector for line
PX6298.1. Transcribed DNA sequences (ESTs) and predicted exons are
shown as bars on the two sides (sense and antisense strand). Predicted
exons of the cDNA with GadFly Accession Number CG1534 are shown as
dark grey bars and introns as light grey lines. The sequence corresponds to
a gene that is predicted by GadFly sequence analysis programs as Accession
Number CG1534. The integration site of the vector for line PX6298.1 was
identified at position 215791 on Drosophila chromosome X. Predicted
exons of the cDNA with GadFly Accession Number CG1534 are located on
chromosome X in three positions, starting with ATG start codons at
positions 209052 (two transcripts) and 215668. Only the transcript with
the start codon at position 215668 encodes GABARAP (Gadfly
Accession Number CT3947; synonym for Gadfly Accession Number
CG32672 and Genbank Accession Number NM_167245). Using those
isolated genomic sequences, public databases like Berkeley Drosophila
Genome Project (GadFly) were screened, confirming the hemizygous viable
integration site of the PX6298.1 vector in the 5' untranslated region
of the first exon of the gene encoding GABARAP (123 base pairs 5' of
the start codon), causing a decrease of triglyceride content.
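
The 123 base pair distance quoted above follows directly from the two chromosome X coordinates given in the text. A minimal sketch of that arithmetic is shown here; the function name is hypothetical and the calculation makes no statement about strand orientation.

```python
# Coordinates on Drosophila chromosome X as stated in the text.
INTEGRATION_SITE = 215791   # integration site of the PX6298.1 vector
START_CODON = 215668        # ATG of the transcript encoding GABARAP


def distance_to_start_codon(integration_site, start_codon):
    """Absolute distance in base pairs between an integration site and a start codon."""
    return abs(integration_site - start_codon)


offset = distance_to_start_codon(INTEGRATION_SITE, START_CODON)
print(offset)  # 123
```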

In FIGURE 12, genomic DNA sequence is represented by the assembly as
a dotted black line (from position 5703500 to 5707500 on chromosome
3L) that includes the integration site of the vector for line EP(3)3271.
Transcribed DNA sequences (ESTs) and predicted exons are shown as bars
in the lower two lines. Predicted exons of the cDNA with GadFly
Accession Number CG10576 are shown as dark grey bars and introns as
light grey bars. Methionyl aminopeptidase is encoded by a gene that is
predicted by GadFly sequence analysis programs as Accession Number
CG10576. Public DNA sequence databases (for example, NCBI GenBank)
were screened, thereby identifying the integration site of line EP(3)3271,
causing an increase of triglyceride content. EP(3)3271 is integrated into
the first exon in antisense orientation of the cDNA with Accession Number
CG10576. Therefore, expression of the cDNA with Accession Number
CG10576 could be affected by homozygous integration of the vector of line
EP(3)3271, leading to an increase of the energy storage triglycerides.

In FIGURE 17, genomic DNA sequence is represented by the assembly as
a dotted black line (from position 10988000 to 10992000 on chromosome
3L) that includes the integration site of the vector for line EP(3)3688.
Transcribed DNA sequences (ESTs) and predicted exons are shown as bars
in the lower two lines. Predicted exons of the cDNA with GadFly
Accession Number CG7858 are shown as dark grey bars and introns as
light grey bars. Mocs1 is encoded by a gene that is predicted by GadFly
sequence analysis programs as Accession Number CG7858. Public DNA
sequence databases (for example, NCBI GenBank) were screened, thereby
identifying the integration site of line EP(3)3688, causing an increase of
triglyceride content. EP(3)3688 is integrated into the promoter in sense
direction of the cDNA with Accession Number CG7858. Therefore,
expression of the cDNA with Accession Number CG7858 could be
affected by homozygous integration of the vector of line EP(3)3688, leading
to an increase of the energy storage triglycerides.

In FIGURE 22, genomic DNA sequence is represented by the assembly as
a dotted black line (from position 3272156 to 3277156 on chromosome
2R) that includes the integration site of the vector for line EP(2)2036.
Transcribed DNA sequences (ESTs) and predicted exons are shown as bars
in the lower two lines. Predicted exons of the cDNA with GadFly
Accession Number CG8705 are shown as dark grey bars and introns as
light grey bars. Pnut is encoded by a gene that is predicted by GadFly
sequence analysis programs as Accession Number CG8705. Public DNA
sequence databases (for example, NCBI GenBank) were screened, thereby
identifying the integration site of line EP(2)2036, causing an increase of
triglyceride content. EP(2)2036 is integrated into the promoter/enhancer of
peanut in antisense orientation of the cDNA with Accession Number
CG8705. Therefore, expression of the cDNA with Accession Number
CG8705 could be affected by homozygous integration of the vector of line
EP(2)2036, leading to an increase of the energy storage triglycerides.

In FIGURE 27, genomic DNA sequence is represented by the assembly as
a dotted black line (from position 18113034 to 18116159 on chromosome
3R) that includes the integration site of the vector for line EP(3)3224.
Transcribed DNA sequences (ESTs) and predicted exons are shown as bars
in the upper two lines. Predicted exons of the cDNA with GadFly
Accession Number CG7069 are shown as dark grey bars and introns as
light grey bars. The sequence corresponds to a gene that is predicted by
GadFly sequence analysis programs as Accession Number CG7069. Public
DNA sequence databases (for example, NCBI GenBank) were screened,
thereby identifying the integration site of line EP(3)3224, causing an
increase of triglyceride content. EP(3)3224 is integrated into the second
exon of pyruvate kinase in sense orientation of the cDNA with GadFly
Accession Number CG7069. Therefore, expression of the cDNA with
GadFly Accession Number CG7069 could be affected by homozygous
integration of the vector of line EP(3)3224, leading to an increase of the
energy storage triglycerides.

In FIGURE 31, genomic DNA sequence is represented by the assembly as
a dotted black line (from position 5435825 to 5438950 on chromosome
3R) that includes the integration sites of vectors for lines EP(3)3321,
EP(3)0979, and EP(3)0834. Transcribed DNA sequences (ESTs) and
predicted exons are shown as bars in the lower two lines. Predicted exons
of the cDNA with GadFly Accession Number CG9429 are shown as dark
grey bars and introns as light grey bars. Calreticulin is encoded by a gene
that is predicted by GadFly sequence analysis programs as Accession
Number CG9429. Public DNA sequence databases (for example, NCBI
GenBank) were screened, thereby identifying the integration sites of lines
EP(3)3321, EP(3)0979, and EP(3)0834, causing an increase of triglyceride
content. EP(3)3321, EP(3)0979, and EP(3)0834 are integrated into the
transcription unit of the cDNA with Accession Number CG9429. Therefore,
expression of the cDNA with Accession Number CG9429 could be affected
by homozygous integration of the vectors of lines EP(3)3321, EP(3)0979,
and EP(3)0834, leading to an increase of the energy storage triglycerides.

Example 3: Identification of human homologous genes and proteins 

The Drosophila genes and proteins encoded thereby with functions in the 
regulation of triglyceride metabolism were further analysed using the 
BLAST algorithm searching in publicly available sequence databases and 
mammalian homologs were identified (see FIGURES 3, 4, 8, 9, 13, 14, 18, 
19, 23, 24, 28, 29, 32, and 33). 

As shown in FIGURE 3A, the gene product of Drosophila gilgamesh (gish;
Gadfly Accession Number CG6963; Genbank Accession Number
NM_080202) is 83% homologous over 426 amino acids (of 447 amino
acids) to a human casein kinase 1 (also referred to as casein kinase 1,
gamma 3) (GenBank Accession Number XM_049422 for the cDNA,
XP_049422 for the protein). The gene product of Drosophila CG6963 is
80% homologous over 426 amino acids (of 459 amino acids) to human
sequence 4 from patent WO 01/64905 (GenBank Accession Number
AX239864). Casein kinase 1 homologous proteins and nucleic acid
molecules coding therefor are obtainable from insect or vertebrate
species, e.g. mammals or birds. Particularly preferred are human casein
kinase 1 homologous nucleic acids and polypeptides encoded thereby,
particularly encoding (i) human casein kinase 1, gamma 1 (Genbank
Accession Numbers NM_022048, AB042563 for the cDNA, NP_071331
for the protein, Swiss Prot. Accession Number Q9HCP0 for the protein; see
Figure 3B and 3C; SEQ ID NO: 1 and 2), (ii) human casein kinase 1,
gamma 2 (Genbank Accession Number NM_001319 for the cDNA,
NP_001310 for the protein; see Figure 3D and 3E; SEQ ID NO: 3 and 4), or
(iii) human casein kinase 1, gamma 3 (Genbank Accession Number
NM_004384 for the cDNA, NP_003475 for the protein, formerly Genbank
Accession Number XM_049422; see Figure 3F and 3G; SEQ ID NO: 5 and
6). An alignment of the casein kinases 1 from different species has been
done by the Clustal W program and is illustrated in Figure 4.
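
Statements of the form "X% homologous over N amino acids (of M amino acids)" combine a per-column score with the length of the aligned region. The sketch below illustrates the two quantities on a toy alignment. It is an illustration only: the sequences are invented, the function name is hypothetical, and it counts identical residues, whereas a BLASTP similarity score would additionally need a substitution matrix.

```python
def alignment_stats(aligned_query, aligned_subject, subject_length):
    """Return (percent identity over the aligned columns, percent of subject covered)."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    columns = len(aligned_query)
    # Count columns where both sequences carry the same residue (gaps excluded).
    identical = sum(1 for q, s in zip(aligned_query, aligned_subject)
                    if q == s and q != "-")
    percent_identity = 100.0 * identical / columns
    # Fraction of the full-length subject that takes part in the alignment.
    subject_residues = sum(1 for s in aligned_subject if s != "-")
    coverage = 100.0 * subject_residues / subject_length
    return percent_identity, coverage


# Toy alignment with one gap ("-") in the query; sequences are invented.
identity, coverage = alignment_stats("MKT-LLV", "MKSALLV", subject_length=10)
print(f"identity {identity:.1f}%, coverage {coverage:.1f}%")
```

For the gilgamesh comparison stated above, an aligned region of 426 out of 447 amino acids corresponds to roughly 95% of the sequence length.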

As shown in FIGURE 8A, the gene product of Drosophila CG1534 (also
referred to as Gadfly Accession Number CG32672) is 96% homologous
over 113 amino acids to human GABARAP (Genbank Accession Number
NP_009209.1), and to mouse GABARAP (Genbank Accession Number
NP_062723.1). GABARAP homologous proteins and nucleic acid molecules
coding therefor are obtainable from insect or vertebrate species, e.g.
mammals or birds. Particularly preferred are human GABARAP homologous
nucleic acids and polypeptides encoded thereby, particularly encoding (i)
human GABARAP (Genbank Accession Number NM_007278 for the cDNA,
NP_009209 for the protein; see Figure 8B and 8C; SEQ ID NO: 7 and 8),
(ii) human GABARAP like 1 (Genbank Accession Number NM_031412 for
the cDNA, NP_113600 for the protein; see Figure 8D and 8E; SEQ ID NO:
9 and 10), (iii) human GABARAP like 2 (Genbank Accession Number
NM_007285 for the cDNA, NP_009216 for the protein; see Figure 8F and
8G; SEQ ID NO: 11 and 12), or (iv) human GABARAP like 3 (Genbank
Accession Number NM_032568 for the cDNA, NP_115957 for the protein;
see Figure 8H and 8I; SEQ ID NO: 13 and 14). An alignment of GABARAP
and GABARAP like proteins from different species has been done by the
Clustal W program and is illustrated in Figure 9.

As shown in FIGURE 13A, the gene product of Drosophila CG10576 is
70% homologous over 276 amino acids (of 386 amino acids) to human
proliferation-associated 2G4, 38kD (also referred to as PA2G4, HG4-1, and
cell cycle protein; GenBank Accession Number XM_049048; Lamartine et
al., 1997, Cytogenet. Cell Genet. 78:31-35), which is identical to
sequence 5 from patent US 5,871,973 (GenBank Accession
Number AAE06380.1). The gene product of Drosophila CG10576 is 70%
homologous over 276 amino acids (of 386 amino acids) to mouse
proliferation-associated 2G4, 38kD (GenBank Accession Number
NM_011119), which is identical to sequence 10 from patent US
5,871,973 (GenBank Accession Number AAE06384.1). PA2G4
homologous proteins and nucleic acid molecules coding therefor are
obtainable from insect or vertebrate species, e.g. mammals or birds.
Particularly preferred are human PA2G4 homologous nucleic acids and
polypeptides encoded thereby, particularly encoding human
proliferation-associated 2G4 protein (Genbank Accession Number
NM_006191 for the cDNA, NP_006182 for the protein, formerly Genbank
Accession Number XM_049048; see Figure 13B and 13C; SEQ ID NO: 15
and 16). An alignment of PA2G4, 38 kDa homologs from different species
has been done by the Clustal W program and is illustrated in Figure 14.

As shown in FIGURE 18A, the gene product of Drosophila Mocs1 (Gadfly
Accession Number CG7858) is 77% homologous over 351 amino acids (of
385 amino acids) to human molybdenum cofactor biosynthesis protein A
(also referred to as MOCSA; GenBank Accession Number AAB87523), and
77% homologous over 348 amino acids (of 385 amino acids) to human
molybdenum cofactor synthesis 1 (also referred to as MOCS1; GenBank
Accession Number XP_046687). Mocs1 homologous proteins and nucleic
acid molecules coding therefor are obtainable from insect or vertebrate
species, e.g. mammals or birds. Particularly preferred are human Mocs1
homologous nucleic acids and polypeptides encoded thereby, particularly
encoding the human MOCS1 isoforms, for example (i) a human
molybdenum cofactor biosynthesis protein A or a molybdenum cofactor
biosynthesis protein C (also referred to as MOCSA or MOCSC; Genbank
Accession Number AF034374 for the cDNA, AAB87523 for the protein;
see Figure 18B and 18C; SEQ ID NO: 17 and 18), (ii) human molybdenum
cofactor synthesis 1 protein isoforms (MOCS1; Genbank Accession
Numbers XM_046687, XM_166358, NM_005943, NM_005942,
NM_138928 for the cDNAs, XP_046687, XP_166358, NP_005934,
NP_005933, NP_620306 for the proteins; see Figures 18D, 18E, 18F,
18G, 18H, and 18I; SEQ ID NO: 19, 20, 21, 22, 23, and 24). An
alignment of Mocs1 homologs from different species has been done by the
Clustal W program and is illustrated in Figure 19.

As shown in FIGURE 23A, the gene product of Drosophila peanut (pnut; Gadfly
Accession Number CG8705) is 78% homologous over 331 amino acids (of
418 amino acids) to human cell division cycle 10 homolog (GenBank
Accession Number NM_001788; Nakatsuru et al., 1994, BBRC
202:82-87), and 77% homologous over 302 amino acids (of 384 amino
acids) to human CDC10 protein homolog, similar to septin 7 (GenBank
Accession Number XM_011595). The gene product of Drosophila peanut
is 78% homologous over 330 amino acids (of 417 amino acids) to mouse
septin 7 (cell division cycle 10 homolog) (GenBank Accession Number
AJ223782), and 78% homologous over 331 amino acids (of 419 amino
acids) to Candida albicans septin 7 protein (GenBank Accession Number
AAE20750.1, sequence 5 from patent US 5,849,556 and US 5,952,214).
Peanut homologous proteins and nucleic acid molecules coding therefor
are obtainable from insect or vertebrate species, e.g. mammals or birds.
Particularly preferred are human peanut homologous nucleic acids and
polypeptides encoded thereby, particularly encoding human cell division
cycle 10 protein (CDC10; Genbank Accession Numbers XM_165879,
NM_001788 for the cDNA, XP_165879, NP_001779 for the protein;
formerly Genbank Accession Number XM_011595; see Figures 23B and
23C; SEQ ID NO: 25 and 26). An alignment of CDC10 homologs from
different species has been done by the Clustal W program and is illustrated
in Figure 24.

As shown in FIGURE 28A, the gene product of Drosophila GadFly Accession
Number CG7069 is 68% homologous over 412 amino acids (of 531 amino
acids) to human pyruvate kinase, muscle (GenBank Accession Number
XM_037768). The gene product of GadFly Accession Number CG7069 is
68% homologous over 416 amino acids (of 531 amino acids) to mouse
pyruvate kinase 3 (GenBank Accession Number BC016619). Pyruvate
kinase homologous proteins and nucleic acid molecules coding therefor
are obtainable from insect or vertebrate species, e.g. mammals or birds.
Particularly preferred are human pyruvate kinase homologous nucleic acids
and polypeptides encoded thereby, particularly encoding (i) human
pyruvate kinase, muscle (also referred to as PKM1 and PKM2; Genbank
Accession Number X56494 for the cDNA, P14618 and P14786 for the
proteins; formerly Genbank Accession Number XM_037768; see Figure
28B, 28C, and 28D; SEQ ID NO: 27, 28, and 29), or (ii) human pyruvate
kinase, liver and RBC (Genbank Accession Number NM_000298 for the
cDNA, NP_000289 for the protein; see Figure 28E and 28F; SEQ ID NO:
30 and 31). An alignment of pyruvate kinase homologs from different
species has been done by the Clustal W program and is illustrated in
Figure 29.

As shown in FIGURE 32A, the gene product of Drosophila calreticulin (Crc;
Gadfly Accession Number CG9429) is 77% homologous over 404 amino
acids (of 417 amino acids) to human calreticulin precursor (GenBank
Accession Number NP_004334). Calreticulin homologous proteins and
nucleic acid molecules coding therefor are obtainable from insect or
vertebrate species, e.g. mammals or birds. Particularly preferred are human
calreticulin homologous nucleic acids and polypeptides encoded thereby,
particularly encoding (i) human calreticulin (Genbank Accession Numbers
NM_004343, M84739 for the cDNA, NP_004334 for the protein; see
Figure 32B and 32C; SEQ ID NO: 32 and 33), or (ii) human calreticulin 2
(hypothetical protein MGC26577; Genbank Accession Number
NM_145046 for the cDNA, NP_659483 for the protein; see Figure 32D
and 32E; SEQ ID NO: 34 and 35). An alignment of calreticulin homologs
from different species has been done by the Clustal W program and is
illustrated in Figure 33.

Example 4: dUCPy modifier screen 

Expression of Drosophila uncoupling protein dUCPy in a non-vital organ like
the eye (Gal4 under control of the eye-specific promoter of the "eyeless"
gene) results in flies with visibly damaged eyes. This easily visible eye
phenotype is the basis of a genetic screen for gene products that can
modify UCP activity.

Parts of the genomes of the strain with Gal4 expression in the eye and the
strain carrying the pUAST-dUCPy construct were combined on one
chromosome using genomic recombination. The resulting fly strain has
eyes that are permanently damaged by dUCPy expression. Flies of this
strain were crossed with flies of a large collection of mutagenized fly
strains. In this mutant collection a special expression system (EP-element,
Ref.: Rørth P, Proc Natl Acad Sci U S A 1996, 93(22):12418-22) is
integrated randomly in different genomic loci. The yeast transcription factor
Gal4 can bind to the EP-element and activate the transcription of
endogenous genes close to the integration site of the EP-element. The
activation of the genes therefore occurs in the same cells (eye) that
overexpress dUCPy. Since the mutant collection contains several thousand
strains with different integration sites of the EP-element, it is possible to
test for a large number of genes whether their expression interacts with
dUCPy activity. If a gene acts as an enhancer of UCP activity, the eye
defect will be worsened; a suppressor will ameliorate the defect.

Using this screen, a gene with suppressing activity was discovered and
was found to be the calreticulin gene in Drosophila.

Example 5: Genetic adipose pathway screen

Adipose (adp) is a protein that has been described as regulating, causing or
contributing to obesity in an animal or human (see WO 01/96371).
Transgenic flies containing a wild type copy of the adipose cDNA under the
control of the Gal4/UAS system were generated (Brand and Perrimon,
1993, Development 118:401-415; for adipose cDNA, see WO 01/96371).
Chromosomal recombination of these transgenic flies with an eyeless-Gal4
driver line has been used to generate a stable recombinant fly line
over-expressing adipose in the developing Drosophila eye. Animals
receiving transgenic adipose activity under these conditions developed into
adult flies with a visible change of eye phenotype. Virgins of the
recombinant driver line were crossed with males of the mutant EP-line
collection in single crosses and kept for preferably 12 to 15 days at 29°C.
The offspring were checked for modifications of the eye phenotype
(enhancement or suppression). Mutations changing the eye phenotype
affect genes that modify adipose activity. The inventors have found that
the fly line HD-EP(3)37409 is an enhancer of the eye-adp-Gal4 induced
phenotype. This result strongly suggests an interaction of the gilgamesh
gene with adipose, since the integration of HD-EP(3)37409 was found to be
located at the gilgamesh locus. This supports the function of
gilgamesh and homologous proteins in the regulation of the energy
homeostasis.



Example 6: Expression profiling experiments 

To analyze the expression of the polypeptides disclosed in this invention in
mammalian tissues, several mouse strains (preferably mouse strains
C57Bl/6J, C57Bl/6 ob/ob and C57Bl/KS db/db, which are standard model
systems in obesity and diabetes research) were purchased from Harlan
Winkelmann (33178 Borchen, Germany) and maintained under constant
temperature (preferably 22°C), 40 per cent humidity and a light/dark
cycle of preferably 14/10 hours. The mice were fed a standard diet (for
example, from ssniff Spezialitäten GmbH, order number ssniff M-Z
V1126-000). For the fasting experiment ("fasted-mice"), wild type mice
were deprived of food for 48 h, with water supplied ad libitum
(see, for example, Schnetzler et al. J Clin Invest 1993 Jul;92(1):272-80,
Mizuno et al. Proc Natl Acad Sci U S A 1996 Apr 16;93(8):3434-8).
Animals were sacrificed at an age of 6 to 8 weeks. The animal tissues
were isolated according to standard procedures known to those skilled in
the art, snap frozen in liquid nitrogen and stored at -80°C until needed.

For analyzing the role of the proteins disclosed in this invention in the in
vitro differentiation of different mammalian cell culture cells for the
conversion of pre-adipocytes to adipocytes, mammalian fibroblast (3T3-L1)
cells (e.g., Green & Kehinde, Cell 1: 113-116, 1974) were obtained from
the American Type Culture Collection (ATCC, Manassas, VA, USA;
ATCC CL-173). 3T3-L1 cells were maintained as fibroblasts and
differentiated into adipocytes as described in the prior art (e.g., Qiu et al.,
J. Biol. Chem. 276:11988-95, 2001; Slieker et al., BBRC 251: 225-9,
1998). At various time points of the differentiation procedure, beginning
with day 0 (day of confluence) and day 2 (hormone addition; for example,
dexamethasone and 3-isobutyl-1-methylxanthine), up to 10 days of
differentiation, suitable aliquots of cells were taken every two days.
Alternatively, mammalian fibroblast 3T3-F442A cells (e.g., Green &
Kehinde, Cell 7: 105-113, 1976) were obtained from the Harvard Medical
School, Department of Cell Biology (Boston, MA, USA). 3T3-F442A cells
were maintained as fibroblasts and differentiated into adipocytes as
described previously (Djian, P. et al., J. Cell. Physiol., 124:554-556,
1985). At various time points of the differentiation procedure, beginning
with day 0 (day of confluence and hormone addition, for example, insulin),
up to 10 days of differentiation, suitable aliquots of cells were taken every
two days. 3T3-F442A cells differentiate in vitro already in the
confluent stage after hormone (insulin) addition.

RNA was isolated from mouse tissues or cell culture cells using Trizol
Reagent (for example, from Invitrogen, Karlsruhe, Germany) and further
purified with the RNeasy Kit (for example, from Qiagen, Germany) in
combination with a DNase treatment according to the instructions of the
manufacturers and as known to those skilled in the art. Total RNA was
reverse transcribed (preferably using Superscript II RNaseH- Reverse
Transcriptase, from Invitrogen, Karlsruhe, Germany) and subjected to
Taqman analysis, preferably using the Taqman 2xPCR Master Mix (from
Applied Biosystems, Weiterstadt, Germany; according to the manufacturer,
the Mix contains, for example, AmpliTaq Gold DNA Polymerase, AmpErase
UNG, dNTPs with dUTP, passive reference Rox and optimized buffer
components) on a GeneAmp 5700 Sequence Detection System (from
Applied Biosystems, Weiterstadt, Germany).

Taqman analysis of casein kinase 1, gamma 1 (CK1G1) was performed
preferably using the following primer/probe pairs: mouse CK1G1 forward
primer (Seq ID NO: 36) 5'- AAT GTC GAT GAC CCC ACT GG -3'; mouse
CK1G1 reverse primer (Seq ID NO: 37) 5'- TCC ACT ACC TCC ACT TCG
GC -3'; mouse CK1G1 Taqman probe (Seq ID NO: 38) (5/6-FAM) TCA
CTC CAA TGC ACC AAT CAC AGC TCA (5/6-TAMRA).
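
Basic properties of such oligonucleotides (length, GC content, reverse complement) can be checked with a few lines of code. The sketch below uses the mouse CK1G1 sequences listed above; the helper names are hypothetical, and this check is an illustration only, not part of the described Taqman procedure.

```python
# Mouse CK1G1 oligonucleotides as listed in the text (Seq ID NO: 36 to 38).
OLIGOS = {
    "forward": "AAT GTC GAT GAC CCC ACT GG",
    "reverse": "TCC ACT ACC TCC ACT TCG GC",
    "probe": "TCA CTC CAA TGC ACC AAT CAC AGC TCA",
}

# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq):
    """Remove the spacing between triplets and normalise to upper case."""
    return "".join(seq.split()).upper()


def gc_percent(seq):
    """Percentage of G and C bases in the sequence."""
    s = clean(seq)
    return 100.0 * sum(1 for base in s if base in "GC") / len(s)


def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return clean(seq).translate(COMPLEMENT)[::-1]


for name, seq in OLIGOS.items():
    print(f"{name}: {len(clean(seq))} nt, GC {gc_percent(seq):.1f}%")
```

The reverse complement of the reverse primer gives the sequence it anneals to on the sense strand, which is useful when locating the amplicon in a cDNA sequence.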

Taqman analysis of casein kinase 1, gamma 3 (CK1G3) was performed 
preferably using the following primer/probe pairs: mouse CK1G3 forward 
primer (Seq ID NO: 39) 5'- AAA TGG AGA GCT GAA CAC GGA -3'; mouse 
CK1G3 reverse primer (Seq ID NO: 40) 5'- TGT AGG AGC TGT AAT GGG 
TGC A -3'; mouse CK1G3 Taqman probe (Seq ID NO: 41) (5/6-FAM) CCC 
CAC GGC AGG ACG GTC G (5/6-TAMRA). 

Taqman analysis of GABARAP 2 was performed preferably using the 
following primer/probe pairs: mouse GABARAP 2 forward primer (Seq ID 
NO: 42) 5'- TCA GCC CAG GAA GAA CTT GTG-3'; mouse GABARAP 2 
reverse primer (Seq ID NO: 43) 5'- CAA GGC TGT GAT TCA TGT CGT C 
-3'; mouse GABARAP 2 Taqman probe (Seq ID NO: 44) (5/6-FAM) TGC 
ATT GGC TGT GAG AGC GGG AT (5/6-TAMRA). 

Taqman analysis of the proliferation associated protein 2G4, 38kDa 
(PA2G4) was performed preferably using the following primer/probe pairs: 
mouse PA2G4 forward primer (Seq ID NO: 45) 5'- AGA CGA GCA GCA 
GGA GCA A -3'; mouse PA2G4 reverse primer (Seq ID NO: 46) 5'- TGT 
CGC CCC CCA TCT TAT AC -3'; mouse PA2G4 Taqman probe (Seq ID 
NO: 47) (5/6-FAM) ATC GCC GAG GAC CTG GTC GTG AC (5/6-TAMRA). 

Taqman analysis of Mocs was performed preferably using the following 
primer/probe pairs: mouse Mocs forward primer (Seq ID NO: 48) 5'- CCT 
GAG CCA CGT GCA GGT -3'; mouse Mocs reverse primer (Seq ID NO: 49) 
5'- AGG ATG CCT GGA TCA ACA CAG -3'; mouse Mocs Taqman probe 
(Seq ID NO: 50) (5/6-FAM) CAC CTG GAG TTA GAC AGC ACA CGC CA 
(5/6-TAMRA). 

Taqman analysis of the peanut homologous protein (Peanut) was 
performed preferably using the following primer/probe pairs: mouse Peanut 
forward primer (Seq ID NO: 51) 5'- CGA GGA GAG GAG CGT CAA CT -3'; 
mouse Peanut reverse primer (Seq ID NO: 52) 5'- CCC ACA TAG CCC TCA 
AGG TTC -3'; mouse Peanut Taqman probe (Seq ID NO: 53) (5/6-FAM) 
CGG CAC CAT GGC TCA ACC GA (5/6-TAMRA). 

Expression profiling studies confirm the particular relevance of casein 
kinase 1 gamma, GABARAP, PA2G4, MOCS1, and CDC10 as regulators of 
energy metabolism in mammals. The results are shown in FIGURES 5, 10, 
15, 20, and 25. Casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, and 
CDC10 are expressed in many tissues. In addition, significant 
expression levels of casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, 
and CDC10 were found in metabolically active tissues such as white adipose 
tissue (WAT) and brown adipose tissue (BAT) (FIGURES 5A, 5C, 10A, 
15A, 20A, and 25A), confirming a role in the regulation of energy 
homeostasis and thermogenesis. 

Further, we show that casein kinase 1 gamma, GABARAP, PA2G4, 
MOCS1, and CDC10 are regulated by fasting and by genetically induced 
obesity, and thus that the expression of casein kinase 1 gamma, 
GABARAP, PA2G4, MOCS1, and CDC10 is under metabolic control. In this 
invention, we used mouse models of insulin resistance and/or diabetes, 
such as mice carrying gene knockouts in the leptin pathway (for example, 
ob (leptin) or db (leptin receptor/ligand) mice) to study the expression of 
the proteins of the invention. Such mice develop typical symptoms of 
diabetes, show hepatic lipid accumulation and frequently have increased 
plasma lipid levels (see Bruning et al, 1998, Mol. Cell. 2:449-569). 

The GABARAP protein was also examined in the in vitro differentiation 
models for the conversion of pre-adipocytes to adipocytes, as described 
above. 

As shown in Figures 5A and 5C, real time PCR (Taqman) analysis of the 
expression of the casein kinase 1, gamma 1 (CK1G1) and casein kinase 1, 
gamma 3 (CK1G3) RNA in mammalian (mouse) tissues revealed that 
CK1G1 and CK1G3 are expressed in different mammalian tissues, including 
white adipose tissue (WAT), brown adipose tissue (BAT), hypothalamus, 
and brain. The high expression levels of CK1G1 and CK1G3 in these 
tissues indicate that CK1G1 and CK1G3 are involved in the metabolism 
of tissues relevant for the metabolic syndrome. The expression of CK1G1 
and CK1G3 is under metabolic control: in genetically obese (ob/ob) mice, 
expression of CK1G1 and CK1G3 is strongly induced in BAT (see Figures 
5B and 5D). Analysis of the expression of casein kinase 1 gamma 2 
(CK1G2) revealed that CK1G2 is expressed in different mammalian tissues, 
with the strongest expression in testis, and that CK1G2 expression is 
2.5-fold higher in brown adipose tissue of genetically obese (ob/ob) mice 
than in wild type mice (data not shown). 

As shown in Figure 10A, real time PCR (Taqman) analysis of the 
expression of the GABARAP 2 RNA in mammalian (mouse) tissues revealed 
that GABARAP 2 is expressed in different mammalian tissues, including 
white adipose tissue (WAT), brown adipose tissue (BAT), liver, 
hypothalamus, brain, and kidney. The high expression levels of GABARAP 
2 in these tissues indicate that GABARAP 2 is involved in the metabolism 
of tissues relevant for the metabolic syndrome. The expression of 
GABARAP 2 is under metabolic control: in fasted mice, expression of 
GABARAP 2 is strongly induced in muscle (see Figure 10B). GABARAP 2 
is down-regulated during the clonal expansion phase of preadipocyte 
differentiation and is present in the differentiated adipocyte (see Figure 10C). 

As shown in Figure 15A, real time PCR (Taqman) analysis of the 
expression of the PA2G4 RNA in mammalian (mouse) tissues revealed that 
PA2G4 is expressed in different mammalian tissues, including white 
adipose tissue (WAT), brown adipose tissue (BAT), and brain. The high 
expression levels of PA2G4 in these tissues indicate that PA2G4 is 
involved in the metabolism of tissues relevant for the metabolic syndrome. 
The expression of PA2G4 is under metabolic control: in genetically obese 
(ob/ob) mice, expression of PA2G4 is strongly induced in BAT and heart, 
and in fasted mice, expression of PA2G4 is strongly induced in heart (see 
Figure 15B). 

As shown in Figure 20A, real time PCR (Taqman) analysis of the 
expression of the Mocs RNA in mammalian (mouse) tissues revealed that 
Mocs is rather ubiquitously expressed in wildtype mice. The expression of 
Mocs in brown adipose tissue is under metabolic control: in genetically 
obese (ob/ob) mice, expression is strongly induced compared to wildtype 
levels (see Figure 20B). 

As shown in Figure 25A, real time PCR (Taqman) analysis of the 
expression of the Peanut homologous RNA in mammalian (mouse) 
tissues revealed that the Peanut homolog is rather ubiquitously expressed 
in wildtype mice. The expression of the Peanut homolog in brown adipose 
tissue is under metabolic control: in genetically obese (ob/ob) mice, 
expression is strongly induced compared to wildtype levels (see Figure 
25B). 
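
The text reports induction and fold differences in expression but does not state how relative expression was calculated from the Taqman data. One commonly used approach for real-time PCR is the comparative Ct method (2^-ddCt). The sketch below assumes that method, normalization to a reference gene, and an amplification efficiency of roughly 100% for both amplicons; none of these assumptions is taken from the text, and all Ct values shown are hypothetical.

```python
def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative expression by the comparative Ct (2^-ddCt) method.

    Assumes roughly 100% PCR efficiency for both amplicons, which the
    source text does not state.
    """
    delta_sample = ct_target_sample - ct_ref_sample      # normalize the sample
    delta_control = ct_target_control - ct_ref_control   # normalize the control
    return 2.0 ** -(delta_sample - delta_control)


if __name__ == "__main__":
    # Hypothetical Ct values, for illustration only (not data from the text).
    fold = fold_change_ddct(24.0, 18.0, 26.0, 18.0)
    print(f"fold change vs. control: {fold:.1f}")
```

With these hypothetical values, the target crosses the threshold two cycles earlier in the sample than in the control after normalization, which corresponds to a four-fold higher expression under the stated assumptions.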

Example 6: In vitro assays for the determination of triglyceride storage, 
synthesis and transport 

Obesity is known to have different causes, such as non-insulin 
dependent diabetes, an increase in triglycerides, an increase in 
carbohydrate-bound energy, and low energy expenditure. For example, an 
increase in energy expenditure (and thus a lowering of body weight) would 
include the elevated utilization of both circulating and intracellular glucose 
and triglycerides, free or stored as glycogen or lipids, as fuel for energy 
and/or heat production. The cellular levels of triglycerides and glycogen are 
analyzed in cells overexpressing the protein of the invention. 

Preparation of cell lysates for analysis of metabolites 

Starting at confluence (d0), cell media were changed every 48 hours. Cells 
and media were harvested 8 hours prior to media change as follows. Media 
were collected, and cells were washed twice in PBS prior to lysis in 600 µl 
HB-buffer (0.5% polyoxyethylene 10 tridecylethane, 1 mM EDTA, 0.01 M 
NaH2PO4, pH 7.4). After inactivation at 70°C for 5 minutes, cell lysates 
were prepared on Bio 101 systems lysing matrix B (0.1 mm silica beads; 
Q-Biogene, Carlsbad, USA) by agitation for 2 x 45 seconds at a speed of 
4.5 (Fastprep FP120, Bio 101 Thermosavant, Holbrook, USA). 
Supernatants of lysed cells were collected after centrifugation at 3000 rpm 
for 2 minutes, and stored in aliquots at -80°C for later analysis. 

Changes in cellular triglyceride levels during adipogenesis 
Cell lysates and media were simultaneously analysed in 96-well plates for 
total protein and triglyceride content using the Bio-Rad DC Protein assay 
reagent (Bio-Rad, Munich, Germany) according to the manufacturer's 
instructions and a modified enzymatic triglyceride kit (GPO-Trinder; Sigma). 
Briefly, final volumes of reagents were adjusted to the 96-well format as 
follows: 10 µl sample was incubated with 200 µl reagent A for 5 minutes 
at 37°C. After determination of glycerol (initial absorbance at 540 nm), 50 
µl reagent B was added followed by another incubation for 5 minutes at 
37°C (final absorbance at 540 nm). Glycerol and triglyceride 
concentrations were calculated using a glycerol standard set (Sigma) for 
the standard curve included in each assay. 
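
The conversion of the two absorbance readings into concentrations via the glycerol standard curve can be sketched as follows. This is an illustrative calculation only: it assumes a linear standard curve fitted by least squares, and it assumes that the initial reading reflects free glycerol while the final reading reflects free glycerol plus glycerol released from triglycerides. The change in well volume on addition of reagent B is not corrected for here. The function names and all numerical values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired standards")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def glycerol_and_triglyceride(a_initial, a_final, slope, intercept):
    """Convert the two A540 readings into glycerol-equivalent concentrations.

    Assumption: the initial reading reflects free glycerol and the final
    reading reflects free glycerol plus glycerol released from triglycerides.
    """
    free_glycerol = (a_initial - intercept) / slope
    total_glycerol = (a_final - intercept) / slope
    return free_glycerol, total_glycerol - free_glycerol


if __name__ == "__main__":
    # Hypothetical glycerol standards (concentration, A540), illustration only.
    conc = [0.0, 1.0, 2.0, 4.0]
    a540 = [0.05, 0.25, 0.45, 0.85]
    slope, intercept = fit_line(conc, a540)
    print(glycerol_and_triglyceride(0.15, 0.65, slope, intercept))
```

The triglyceride value is expressed in the same units as the glycerol standards and would typically be normalized to the total protein content determined in parallel.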

Changes in cellular glycogen levels during adipogenesis 
Cell lysates and media were simultaneously analysed in triplicate in 
96-well plates for total protein and glycogen content using the Bio-Rad DC 
Protein assay reagent (Bio-Rad, Munich, Germany) according to the 
manufacturer's instructions and an enzymatic starch kit from Hoffmann-La 
Roche (Basel, Switzerland). 10-µl samples were incubated with 20 µl 
amyloglucosidase solution for 15 minutes at 60°C to digest glycogen to 
glucose. The glucose is further metabolised with 100 µl distilled water, 
100 µl of enzyme cofactor buffer and 12 µl of enzyme buffer (hexokinase 
and glucose phosphate dehydrogenase). Background glucose levels are 
corrected for by subtracting the values from a duplicate plate without 
amyloglucosidase. Final absorbance is determined at 340 nm. HB-buffer as 
blank, and a standard curve of glycogen (Hoffmann-La Roche) were 
included in each assay. Glycogen contents in samples were calculated 
using a standard curve. 
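
The background correction described above can be sketched as follows. The example averages the replicate readings, subtracts the signal of the plate without amyloglucosidase, and normalizes the result to protein. It assumes a linear, blank-corrected glycogen standard curve that is summarized by its slope; the function names, units and readings are hypothetical and serve only as an illustration.

```python
from statistics import mean


def net_glycogen_signal(with_enzyme, without_enzyme):
    """Mean A340 with amyloglucosidase minus mean A340 of paired wells without it."""
    if not with_enzyme or not without_enzyme:
        raise ValueError("both plates need at least one reading")
    return mean(with_enzyme) - mean(without_enzyme)


def glycogen_per_protein(net_a340, a340_per_ug_glycogen, protein_ug):
    """Convert a net A340 signal to micrograms glycogen per microgram protein.

    `a340_per_ug_glycogen` is the slope of a glycogen standard curve,
    assumed here to be linear and blank-corrected.
    """
    if a340_per_ug_glycogen <= 0 or protein_ug <= 0:
        raise ValueError("slope and protein amount must be positive")
    return (net_a340 / a340_per_ug_glycogen) / protein_ug


if __name__ == "__main__":
    # Hypothetical triplicate readings, for illustration only.
    net = net_glycogen_signal([0.50, 0.52, 0.48], [0.10, 0.11, 0.09])
    print(round(net, 3), round(glycogen_per_protein(net, 0.02, 40.0), 3))
```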

Synthesis of lipids during adipogenesis 

During the terminal stage of adipogenesis (day 12) cells were analysed for 
their ability to metabolise lipids. A modified version of the method of 
Jensen et al (2000) for lipid synthesis was established. Cells were washed 
3 times with PBS prior to serum starvation in 
Krebs-Ringer-Bicarbonate-Hepes buffer (KRBH; 134 mM NaCl, 3.5 mM KCl, 
1.2 mM KH2PO4, 0.5 mM MgSO4, 1.5 mM CaCl2, 5 mM NaHCO3, 10 mM 
Hepes, pH 7.4), supplemented with 0.1% FCS for 2.5 h at 37°C. For 
insulin-stimulated lipid synthesis, cells were incubated with 1 µM bovine 
insulin (Sigma; carrier: 0.005 N HCl) for 45 min at 37°C. Basal lipid 
synthesis was determined with carrier only. 14C(U)-D-glucose (NEN Life 
Sciences) in a final activity of 1 µCi/well/ml in the presence of 5 mM 
glucose was added for 30 min at 37°C. For the calculation of background 
radioactivity, 25 µM cytochalasin B (Sigma) was used. All assays were 
performed in duplicate wells. To terminate the reaction, cells were washed 
3 times with ice cold PBS, and lysed in 1 ml 0.1 N NaOH. Protein 
concentration of each well was assessed using the standard Biuret method 
(Protein assay reagent; Bio-Rad). Total lipids were separated from aqueous 
phase after overnight extraction in Insta-Fluor scintillation cocktail (Packard 
Bioscience) followed by scintillation counting. 

Transport and metabolism of free fatty acids during adipogenesis 
During the terminal stage of adipogenesis (d12) cells were analysed for 
their ability to transport long chain fatty acids across the plasma membrane. 
A modified version of the method of Abumrad et al (1991) (Proc. Natl. 
Acad. Sci. USA, 1991: 88; 6008-12) for cellular transport of fatty 
acids was established. In summary, cells were washed 3 times with PBS 
prior to serum starvation. This was followed by incubation in KRBH buffer 
supplemented with 0.1% FCS for 2.5 h at 37°C. Uptake of exogenous free 
fatty acids was initiated by the addition of isotopic media containing 
non-radioactive oleate and (3H)oleate (NEN Life Sciences) complexed to serum 
albumin in a final activity of 1 µCi/well/ml in the presence of 5 mM glucose 
for 30 min at room temperature (RT). For the calculation of passive 
diffusion (PD) in the absence of active transport (AT) across the plasma 
membrane, 20 mM phloretin in glucose-free media (Sigma) was added for 
30 min at RT. All assays were performed in duplicate wells. To terminate 
the active transport, 20 mM phloretin in glucose-free media was added to 
the cells. Cells were lysed in 1 ml 0.1 N NaOH and the protein 
concentration of each well was assessed using the standard Biuret 
method (Protein assay reagent; Bio-Rad). Esterified fatty acids were 
separated from free fatty acids using overnight extraction in Insta-Fluor 
scintillation cocktail (Packard Bioscience) followed by scintillation counting. 

Example 7: Glucose uptake assay 

For the determination of glucose uptake, cells were washed 3 times with 
PBS prior to serum starvation in KRBH buffer supplemented with 0.1% FCS 
and 0.5 mM glucose for 2.5 h at 37°C. For insulin-stimulated glucose 
uptake, cells were incubated with 1 µM bovine insulin (Sigma; carrier: 
0.005 N HCl) for 45 min at 37°C. Basal glucose uptake was determined 
with carrier only. Non-metabolizable 2-deoxy-3H-D-glucose (NEN Life 
Science, Boston, USA) in a final activity of 0.4 µCi/well/ml was added for 
30 min at 37°C. For the calculation of background radioactivity, 25 µM 
cytochalasin B (Sigma) was used. All assays were performed in duplicate 
wells. To terminate the reaction, cells were washed 3 times with ice cold 
PBS, and lysed in 1 ml 0.1 N NaOH. Protein concentration of each well was 
assessed using the standard Biuret method (Protein assay reagent; 
Bio-Rad), and scintillation counting of cell lysates in 10 volumes 
Ultima-gold cocktail (Packard Bioscience, Groningen, Netherlands) was 
performed. 
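
The evaluation of such an uptake assay can be sketched as follows: counts from the duplicate wells are averaged, the cytochalasin B background is subtracted, and the result is normalized to the protein content of the well. The ratio of insulin-stimulated to basal uptake then gives the fold stimulation. The text does not prescribe this exact calculation; the function names, the use of dpm as the unit, and all values are hypothetical and for illustration only.

```python
from statistics import mean


def specific_uptake(sample_dpm, background_dpm, protein_mg):
    """Background-corrected uptake per mg protein from duplicate wells.

    `background_dpm` are counts from wells treated with cytochalasin B.
    """
    if protein_mg <= 0:
        raise ValueError("protein amount must be positive")
    return (mean(sample_dpm) - mean(background_dpm)) / protein_mg


def insulin_fold_stimulation(basal, stimulated):
    """Ratio of insulin-stimulated to basal specific uptake."""
    if basal <= 0:
        raise ValueError("basal uptake must be positive")
    return stimulated / basal


if __name__ == "__main__":
    # Hypothetical counts (dpm) and protein amounts, for illustration only.
    background = [200.0, 220.0]
    basal = specific_uptake([1200.0, 1240.0], background, 0.5)
    stimulated = specific_uptake([4200.0, 4300.0], background, 0.5)
    print(basal, stimulated, insulin_fold_stimulation(basal, stimulated))
```

The same background subtraction and protein normalization would apply to the lipid synthesis assay of Example 6, where cytochalasin B is likewise used to determine background radioactivity.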

Example 8: Generation and analysis of transgenic mice 
Generation of the transgenic animals 

Mouse cDNA was isolated from mouse brown adipose tissue (BAT) using 
standard protocols as known to those skilled in the art. The cDNA was 
amplified by RT-PCR and point mutations were introduced into the cDNA. 

The resulting mutated cDNA was cloned into a suitable transgenic 
expression vector. The transgene was microinjected into the male 
pronucleus of fertilized mouse embryos (preferably strain C57/BL6/CBA F1 
(Harlan Winkelmann)). Injected embryos were transferred into 
pseudo-pregnant foster mice. Transgenic founders were detected by PCR 
analysis. Two independent transgenic mouse lines containing the construct 
were established and kept on a C57/BL6 background. Briefly, founder 
animals were backcrossed with C57/BL6 mice to generate F1 mice for 
analysis. Transgenic mice were continuously bred onto the C57/BL6 
background. The expression of the proteins of the invention can be 
analyzed by Taqman analysis as described above, and further analysis of 
the mice can be done as known to those skilled in the art. 

All publications and patents mentioned in the above specification are herein 
incorporated by reference. 

Various modifications and variations of the described method and system 
of the invention will be apparent to those skilled in the art without 
departing from the scope and spirit of the invention. Although the invention 
has been described in connection with specific preferred embodiments, it 
should be understood that the invention as claimed should not be unduly 
limited to such specific embodiments. Indeed, various modifications of the 
described modes for carrying out the invention which are obvious to those 
skilled in molecular biology or related fields are intended to be within the 
scope of the following claims. 

Claims 

1. A pharmaceutical composition comprising a nucleic acid molecule of 
the casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, 
PK, or calreticulin gene family or a polypeptide encoded thereby or 
a fragment or a variant of said nucleic acid molecule or said 
polypeptide or an antibody, an aptamer or another receptor 
recognizing a nucleic acid molecule of the casein kinase 1 gamma, 
GABARAP, PA2G4, MOCS1, CDC10, PK, or calreticulin gene family 
or a polypeptide encoded thereby together with pharmaceutically 
acceptable carriers, diluents and/or adjuvants. 

2. The composition of claim 1, wherein the nucleic acid molecule is a 
vertebrate or insect casein kinase 1 gamma, GABARAP, PA2G4, 
MOCS1, CDC10, PK, or calreticulin nucleic acid, particularly 
encoding human casein kinase 1, gamma 1 (SEQ ID NO: 1), human 
casein kinase 1, gamma 2 (SEQ ID NO: 3), human casein kinase 1, 
gamma 3 (SEQ ID NO: 5), human GABARAP (SEQ ID NO: 7), human 
GABARAP like 1 (SEQ ID NO: 9), human GABARAP like 2 (SEQ ID 
NO: 11), human GABARAP like 3 (SEQ ID NO: 13), human PA2G4 
(SEQ ID NO: 15), human MOCSA (SEQ ID NO: 17), human MOCS1 
isoform 1 (SEQ ID NO: 19), human MOCS1 isoform 2 (SEQ ID NO: 
21), human MOCS1 isoform 3 (SEQ ID NO: 23), human CDC10 
(SEQ ID NO: 25), human pyruvate kinase, muscle (SEQ ID NO: 27), 
human pyruvate kinase, liver and RBC (SEQ ID NO: 30), human 
calreticulin (SEQ ID NO: 32), human calreticulin 2 (SEQ ID NO:34), 
or a fragment thereof or a variant thereof. 

3. The composition of claim 1 or 2, wherein said nucleic acid molecule 

(a) hybridizes at 50°C in a solution containing 1 x SSC and 0.1 % 
SDS to a nucleic acid molecule as defined in claim 2 or a 
nucleic acid molecule which is complementary thereto; 

(b) is degenerate with respect to the nucleic acid molecule of 
(a); 

(c) encodes a polypeptide which is at least 85%, preferably at 
least 90%, more preferably at least 95%, more preferably at 
least 98% and up to 99.6% identical to casein kinase 1, 
gamma 1 (SEQ ID NO: 2), human casein kinase 1, gamma 2 
(SEQ ID NO: 4), human casein kinase 1, gamma 3 (SEQ ID 
NO: 6), human GABARAP (SEQ ID NO: 8), human GABARAP 
like 1 (SEQ ID NO: 10), human GABARAP like 2 (SEQ ID NO: 
12), human GABARAP like 3 (SEQ ID NO: 14), human PA2G4 
(SEQ ID NO: 16), human MOCSA (SEQ ID NO: 18), human 
MOCS1 isoform 1 (SEQ ID NO: 20), human MOCS1 isoform 
2 (SEQ ID NO: 22), human MOCS1 isoform 3 (SEQ ID NO: 
24), human CDC10 (SEQ ID NO: 26), human pyruvate kinase, 
muscle, isozyme M1 (SEQ ID NO: 28), human pyruvate 
kinase, muscle, isozyme M2 (SEQ ID NO: 29), human 
pyruvate kinase, liver and RBC (SEQ ID NO: 31), human 
calreticulin (SEQ ID NO: 33), human calreticulin 2 (SEQ ID 
NO:35), as defined in claim 2; 

(d) differs from the nucleic acid molecule of (a) to (c) by mutation 
and wherein said mutation causes an alteration, deletion, 
duplication or premature stop in the encoded polypeptide. 

4. The composition of any one of claims 1-3, wherein the nucleic acid 
molecule is a DNA molecule, particularly a cDNA or a genomic DNA. 



5. The composition of any one of claims 1-4, wherein said nucleic acid 
encodes a polypeptide contributing to regulating the energy 
homeostasis and/or the metabolism of triglycerides. 

6. The composition of any one of claims 1-5, wherein said nucleic acid 
molecule is a recombinant nucleic acid molecule. 

7. The composition of any one of claims 1-6, wherein the nucleic acid 
molecule is a vector, particularly an expression vector. 

8. The composition of any one of claims 1-5, wherein the polypeptide 
is a recombinant polypeptide. 



9. The composition of claim 8, wherein said recombinant polypeptide is 
a fusion polypeptide. 

10. The composition of any one of claims 1-7, wherein said nucleic acid 
molecule is selected from hybridization probes, primers and 
anti-sense oligonucleotides. 

11. The composition of any one of claims 1-10 which is a diagnostic 
composition. 

12. The composition of any one of claims 1-10 which is a therapeutic 
composition. 

13. The composition of any one of claims 1-12 for the manufacture of 
an agent for detecting and/or verifying, for the treatment, alleviation 
and/or prevention of disorders, including metabolic diseases such 
as obesity and other body-weight regulation disorders as well as 
related disorders such as eating disorder, cachexia, diabetes 
mellitus, hypertension, coronary heart disease, 
hypercholesterolemia, dyslipidemia, osteoarthritis, gallstones, and others, 
in cells, cell masses, organs and/or subjects. 

14. Use of a nucleic acid molecule of the casein kinase 1 gamma, 
GABARAP, PA2G4, MOCS1, CDC10, PK, or calreticulin gene family 
or a polypeptide encoded thereby or a fragment or a variant of said 
nucleic acid molecule or said polypeptide or an antibody, an aptamer 
or another receptor recognizing a nucleic acid molecule of the casein 
kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, or 
calreticulin gene family or a polypeptide encoded thereby for 
controlling the function of a gene and/or a gene product which is 
influenced and/or modified by a casein kinase 1 gamma, GABARAP, 
PA2G4, MOCS1, CDC10, PK, or calreticulin homologous 
polypeptide. 

15. Use of the nucleic acid molecule of the casein kinase 1 gamma, 
GABARAP, PA2G4, MOCS1, CDC10, PK, or calreticulin gene family 
or a polypeptide encoded thereby or a fragment or a variant of said 
nucleic acid molecule or said polypeptide or an antibody, an aptamer 
or another receptor recognizing a nucleic acid molecule of the casein 
kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, or 
calreticulin gene family or a polypeptide encoded thereby for 
identifying substances capable of interacting with a casein kinase 1 
gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, or calreticulin 
homologous polypeptide. 

16. A non-human transgenic animal exhibiting a modified expression of 
a casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, 
or calreticulin homologous polypeptide. 

17. The animal of claim 16, wherein the expression of the casein kinase 
1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, or calreticulin 
homologous polypeptide is increased and/or reduced. 

18. A recombinant host cell exhibiting a modified expression of a casein 
kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, or 
calreticulin homologous polypeptide. 

19. The cell of claim 18 which is a human cell. 

20. A method of identifying a (poly)peptide involved in the regulation of 
energy homeostasis and/or metabolism of triglycerides in a mammal 
comprising the steps of 

(a) contacting a collection of (poly)peptides with a casein kinase 
1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, or 
calreticulin homologous polypeptide or a fragment thereof 
under conditions that allow binding of said (poly)peptides; 

(b) removing (poly)peptides which do not bind and 

(c) identifying (poly)peptides that bind to said casein kinase 1 
gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, or 
calreticulin homologous polypeptide. 

21. A method of screening for an agent which modulates the interaction 
of a casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, 
PK, or calreticulin homologous polypeptide with a binding 
target/agent, comprising the steps of 
(a) incubating a mixture comprising 

(aa) a casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, 
CDC10, PK, or calreticulin homologous polypeptide, or 
a fragment thereof; 

(ab) a binding target/agent of said casein kinase 1 gamma, 
GABARAP, PA2G4, MOCS1, CDC10, PK, or calreticulin 
homologous polypeptide or fragment thereof; and 

(ac) a candidate agent 

under conditions whereby said casein kinase 1 gamma, 
GABARAP, PA2G4, MOCS1, CDC10, PK, or calreticulin 
polypeptide or fragment thereof specifically binds to said 
binding target/agent at a reference affinity; 

(b) detecting the binding affinity of said casein kinase 1 gamma, 
GABARAP, PA2G4, MOCS1, CDC10, PK, or calreticulin 
polypeptide or fragment thereof to said binding target to 
determine an (candidate) agent-biased affinity; and 

(c) determining a difference between (candidate) agent-biased 
affinity and the reference affinity. 

22. A method of screening for an agent which modulates the activity of 
a casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, CDC10, PK, 
or calreticulin homologous polypeptide comprising the steps of 

(a) incubating a mixture comprising 

(aa) a casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, 
CDC10, PK, or calreticulin homologous polypeptide, or 
a fragment thereof, and 

(ab) a candidate agent 

under conditions whereby said casein kinase 1 gamma, 
GABARAP, PA2G4, MOCS1, CDC10, PK, or calreticulin 
polypeptide or fragment thereof has a reference activity; 

(b) detecting the activity of said casein kinase 1 gamma, 
GABARAP, PA2G4, MOCS1, CDC10, PK, or calreticulin 
polypeptide or fragment thereof to determine an (candidate) 
agent-biased activity and 

(c) determining a difference between (candidate) agent-biased 
activity and reference activity. 

23. A method of producing a composition comprising the (poly)peptide 
identified by the method of claim 20 or the agent identified by the 
method of claim 21 or 22 with a pharmaceutically acceptable 
carrier, diluent and/or adjuvant. 

24. The method of claim 23 wherein said composition is a 
pharmaceutical composition for preventing, alleviating or treating of 
diseases and disorders, including metabolic diseases such as obesity 
and other body-weight regulation disorders as well as related 
disorders such as eating disorder, cachexia, diabetes mellitus, 
hypertension, coronary heart disease, hypercholesterolemia, 
dyslipidemia, osteoarthritis, gallstones, and other diseases and 
disorders. 

25. Use of a (poly)peptide as identified by the method of claim 20 or of 
an agent as identified by the method of claim 21 or 22 for the 
preparation of a pharmaceutical composition for the treatment, 
alleviation and/or prevention of diseases and disorders, including 
metabolic diseases such as obesity and other body-weight regulation 
disorders as well as related disorders such as eating disorder, 
cachexia, diabetes mellitus, hypertension, coronary heart disease, 
hypercholesterolemia, dyslipidemia, osteoarthritis, gallstones, and 
other diseases and disorders. 



26. Use of a nucleic acid molecule of the casein kinase 1 gamma, 
GABARAP, PA2G4, MOCS1, CDC10, PK, or calreticulin family or of 
a fragment thereof for the preparation of a non-human animal which 
over- or under-expresses the casein kinase 1 gamma, GABARAP, 
PA2G4, MOCS1, CDC10, PK, or calreticulin gene product. 

27. Kit comprising at least one of 

(a) a casein kinase 1 gamma, GABARAP, PA2G4, MOCS1, 
CDC10, PK, or calreticulin nucleic acid molecule or a fragment 
thereof; 

(b) a vector comprising the nucleic acid of (a); 

(c) a host cell comprising the nucleic acid of (a) or the vector of 
(b); 

(d) a polypeptide encoded by the nucleic acid of (a); 

(e) a fusion polypeptide encoded by the nucleic acid of (a); 

(f) an antibody, an aptamer or another receptor against the 
nucleic acid of (a) or the polypeptide of (d) or (e) and 

(g) an anti-sense oligonucleotide of the nucleic acid of (a). 

FIGURE 3: HUMAN HOMOLOG OF CG6963 (gilgamesh) 

FIGURE 3A. BLASTP result for gilgamesh (Gadfly Accession Number CG6963) 

Homology to human gene ref XM_049422; protein ref XP_049422.2 

>ref |XP_049422.2 | (XM_049422) casein kinase 1, gamma 3 [Homo sapiens] 
Length =447 

Score = 628 bits (1601), Expect = e-178 

Identities = 307/426 (72%), Positives = 356/426 (83%), Gaps = 10/426 (2%) 

Query: 2 YSTRQSVSTTTG^LMVGPNFRVGK^ 61 

++TR + S+++GVLMVGPNFRVGKKIGCGNFGELRLGKNLY NE+VAIK+EPMKS+APQL 
Sbjct: 24 HNTRGTGS S S S GVLMVGPNFRVGKKI GCGNFGELRLGKNL YTNEYVAIKLE PMKS RAPQL 83 

Query: 62 HLEYRFYKLLGSHAEGVPE^TYYFGPCGKYNALVMELLGPSLEDLFDICGRRFTLKSVLLI 121 

HLEYRFYK LGS +G+P+VYYFGPCGKYNA4-V+ELLGPSLEDLFD+C R F+LK+VL+I 
Sbjct: 84 HLEYRFYKQLGS-GDGIPQVYYFGPCGKYNAMVLELLGPSLEDLFDLCDRTFSLKTVLMI 142 

Query: 122 AIQLLHRIEYVHSRHLIYRDVKPENFLIGRTSTKREKIIHIIDFGLAKEYIDLDTNRHIP 181 

AIQL+ R+EYVHS++LIYRDVKPENFLIGR K + + + IHI IDFGL AKEYID +T +HIP 
Sbjct: 143 AIQLISRMEYVHSKNLIYRDVKPENFLIGRPGISKTQQVIHIIDFGLAKEYIDPETKKHIP 202 

Query: 182 YREHKSLTGTARYMSINTHMGREQSRRDDLEALGHMFMYFLRGSLPWQGLKADTLKERYQ 241 

YREHKSLTGTARYMSINTH+G+EQSRRDDLEALGHMFMYFLRGSLPWQGLKADTLKERYQ 
Sbjct: 203 YREHKSLTGTARYMSINTHLGKEQSRRDDLEALGH 262 

Query: 242 KIGDTKRATPIEVLCDGHPEEFATYLRYVRRLDFFETPDYDFLRRLFQDLFDRKGYTDEG 301 

KIGDTKRATPIEVLC+ P E ATYLRYVRRLDFFE PDYD+LR+LF DLFDRKGY + 
Sbjct: 263 KIGDTKRATPIEVLCENFP-EMATYLRYVRRLDFFEKPDYDYLRKLFTDLFDRKGYMFDY 321 

Query: 302 EFDWTGKTMSTPVGSLQTGHEVIISPNKDRHNVTAKTNAKGGVAAWPDVPKPGAT 356 

E+DW GK + TPVG++Q + +S N++ H +K + AAW 

Sbjct: 322 EYDWIGKQLPTPVGAVQ--QDPALSSNREAHQHRDKMQQSK1^QSADHRAAWDSQQANPHH 379 

Query: 357 LGNLTPADRH-GSVQWSSTNGELNPDDPTAGHSNTPITQQPEVEVVDETKCCCFFKRKK 415 

L ADRH GSVQWS STNGELN DDPTAG SN PIT EVEV+DETKCCCFFKR+K 

Sbjct: 380 LRAHLAADRHGGSVQWS STNGELNTDDPTAGRSNAPITAPTEVEVMDETKCCCFFKRRK 439 

Query: 416 KKSTRQ 421 

Sbjct: 440 RKTIQR 445 

FIGURE 3B: Predicted nucleotide sequence encoding human casein kinase 1, gamma 1 
(SEQ ID NO:1) 

1 aactccttac ctttctctga ctacaattta tttggacata cttttgtatt gaagagaggt 

61 atacatactg aagctacttg ctgtactata ggagactctg tcctgtagga tcatggacca 

121 tcctagtagg gaaaaggatg aaagacaacg gacaactaaa cccatggcac aaaggagtgc 

181 acactgctct cgaccatctg gctcctcatc gtcctctggg gttcttatgg tgggacccaa 

241 cttcagggtt ggcaagaaga taggatgtgg gaacttcgga gagctcagat taggtaaaaa 

301 tctctacacc aatgaatatg tagcaatcaa actggaacca ataaaatcac gtgctccaca 

361 gcttcattta gagtacagat tttataaaca gcttggcagt gcaggtgaag gtctcccaca 

421 ggtgtattac tttggaccat gtgggaaata taatgccatg gtgctggagc tccttggccc 

481 tagcttggag gacttgtttg acctctgtga ccgaacattt actttgaaga cggtgttaat 

541 gatagccatc cagctgcttt ctcgaatgga atacgtgcac tcaaagaacc tcatttaccg 

601 agatgtcaag ccagagaact tcctgattgg tcgacaaggc aataagaaag agcatgttat 

661 acacattata gactttggac tggccaagga atacattgac cccgaaacca aaaaacacat 

721 accttatagg gaacacaaaa gtttaactgg aactgcaaga tatatgtcta tcaacacgca 

781 tcttggcaaa gagcaaagcc ggagagatga tttggaagcc ctaggccata tgttcatgta 

841 tttccttcga ggcagcctcc cctggcaagg actcaaggct gacacattaa aagagagata 

901 tcaaaaaatt ggtgacacca aaaggaatac tcccattgaa gctctctgtg agaactttcc 

961 agaggagatg gcaacctacc ttcgatatgt caggcgactg gacttctttg aaaaacctga 

1021 ttatgagtat ttacggaccc tcttcacaga cctctttgaa aagaaaggct acacctttga 

1081 ctatgcctat gattgggttg ggagacctat tcctactcca gtagggtcag ttcacgtaga 

1141 ttctggtgca tctgcaataa ctcgagaaag ccacacacat agggatcggc catcacaaca 

1201 gcagcctctt cgaaatcagg tggttagctc aaccaatgga gagctgaatg ttgatgatcc 
1261 cacgggagcc cactccaatg caccaatcac agctcatgcc gaggtggagg tagtggagga 
1321 agctaagtgc tgctgtttct ttaagaggaa aaggaagaag actgctcagc gccacaagtg 

1381 accagtgcct cccaggagtc ctcaggccct ggggactctg actcaattgt acctgcagca 
1441 tttctcattg gaaggggact cctctttggg ggagggtgga tatccaaacc aaaaagaaga 
1501 aaacagatgc ccccagaagg gggccagtgc gggcagccag ggcctagtgg gtcattggcc 
1561 atctccgctg ctaaggctct gagcaggtcc agagctgctg ttcctccact gcttgcccat 
1621 agggctgcct ggttgactct cttccattg 



FIGURE 3C: Predicted amino acid sequence of human casein kinase 1, gamma 1 (SEQ 
ID NO:2) 

1 mdhpsrekde rqrttkpmaq rsahcsrpsg sssssgvlmv gpnfrvgkki gcgnfgelrl 
61 gknlytneyv aiklepiksr apqlhleyrf ykqlgsageg lpqvyyfgpc gkynamvlel
121 lgpsledlfd lcdrtftlkt vlmiaiqlls rmeyvhsknl iyrdvkpenf ligrqgnkke
181 hvihiidfgl akeyidpetk khipyrehks ltgtarymsi nthlgkeqsr rddlealghm 
241 fmyflrgslp wqglkadtlk eryqkigdtk rntpiealce nfpeematyl ryvrrldffe 
301 kpdyeylrtl ftdlfekkgy tfdyaydwvg rpiptpvgsv hvdsgasait reshthrdrp
361 sqqqplrnqv vsstngelnv ddptgahsna pitahaevev veeakcccff krkrkktaqr
421 hk 
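
The relationship between the nucleotide listing (SEQ ID NO:1) and the predicted amino acid listing (SEQ ID NO:2) can be checked with a short translation sketch. The following is a minimal illustration, not part of the original disclosure: it assumes the standard genetic code and assumes that the open reading frame begins at the atg at nucleotide position 113 of SEQ ID NO:1, as read off the listing above. The names `translate` and `orf_start` are hypothetical helpers introduced for this example.

```python
# Standard genetic code, with codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a lowercase DNA string codon by codon, stopping at a stop codon."""
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First 66 nucleotides of the assumed open reading frame of SEQ ID NO:1
# (assumption: the frame starts at the atg at position 113).
orf_start = (
    "atggacca"                               # positions 113-120 (row numbered 61)
    "tcctagtagg" "gaaaaggatg" "aaagacaacg"   # row numbered 121, blocks 1-3
    "gacaactaaa" "cccatggcac" "aaaggagt"     # blocks 4-6, last two nucleotides dropped
)

peptide = translate(orf_start).lower()
print(peptide)
```

Under these assumptions the translated peptide can be compared with the first 22 residues of the listing in FIGURE 3C.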



FIGURE 3D: Predicted nucleotide sequence encoding human casein kinase 1, gamma 2 
(SEQ ID NO:3) 

1 gggatttgca cggcagcaga gtcaccgtgg agaggccagg gtatcacaaa cttatggatt 

61 ttgacaagaa aggagggaaa ggggagacgg aggagggccg gagaatgtcc aaggccggcg 

121 ggggccggag cagccacggc atccggagct cggggaccag ctcgggggtc ctgatggtgg 

181 gccccaactt ccgcgtcggc aagaagatcg gctgcggcaa cttcggggag ctccgcctag 

241 gaaagaatct ctatacaaat gaatacgtgg ctatcaaatt ggagccgatc aagtcccggg 
301 ccccgcagct gcacctggag taccggttct acaagcagct cagcgccaca gagggcgtcc
361 ctcaggtcta ctacttcggt ccgtgcggga attacaacgc catggtgctg gagctgctgg
421 ggcccagcct ggaggacctg ttcgacctgt gcgaccggac cttcacgctc aagacggtgc
481 tgatgatcgc catccagctg atcacgcgca tggagtatgt gcacaccaag agcctaatct
541 accgggacgt gaagcccgag aacttcctgg tgggccgccc ggggaccaag cggcagcatg
601 ccatccacat catcgacttc gggctggcca aggagtacat cgaccccgag accaagaagc
661 acatcccgta ccgcgagcac aagagcctga cgggcacggc gcgctacatg agcatcaaca
721 cgcacctggg caaggagcag agccgccgcg acgacctgga ggcgctgggc cacatgttca
781 tgtacttcct gcgcggcagc ctcccctggc aggggctcaa ggccgacacg ctcaaggagc
841 ggtaccagaa gatcggggac accaaacgcg ccacgcccat cgaggtgctc tgcgagaact
901 tcccagagga gatggccacg tacctgcgct atgtgcggcg cctggacttc ttcgagaagc
961 ccgactatga ctacctgcgg aagctcttca ccgacctctt cgaccgcagt ggcttcgtgt
1021 tcgactatga gtacgactgg gccgggaagc ccctgccgac ccccatcggc accgtccaca
1081 ccgacctgcc ctcccagcct cagctccggg acaaaaccca gccgcacagc aaaaaccagg
1141 cgttgaactc caccaacggg gagctgaatg cggacgaccc cacggccggc cactccaacg
1201 ccccgatcac agcgcctgca gaggtggagg tggccgatga aaccaaatgc tgctgtttct
1261 tcaagaggag aaagagaaaa tcgctgcagc gacacaagtg accctgggcg cgtgcagccc
1321 cctgaatctt ctccgtgcag ccccttgggg cgcgaccttg tgcgaggccc tcggggccca
1381 cccacagcgg cccagggcca gaccctggct ggaagccaga acgcagactg caggggccgc
1441 gcctggctca ggcggcccca cccccgggac gtggggtcac ttccttcatg taagactttg
1501 gccgaaattt ctacacctgt gtctagtcct cccctccaag agcattaact atttaaaaca
1561 aggaaaagag gaaaaaaaaa acagaggccc gccctacccc actcctgccc ctccgtttct
1621 ttgctgaagt gagtagtgtg atcctggagg ccccccggcc tggccccgcc ccgccagccg
1681 cccccgttag cgtcataaag tccagcttgt ctccctcgat ccaaaggccg ttttctcgag
1741 gggagggcag gcccggcctg gaggggtgct gtggagctgt cttgcccagg ccctcctggg
1801 agggggacag gcattgttgc caggggtgag gccgtgcccc aggcctcccc gaaaccaaag
1861 gggaaggcag gggtggggcc gtggctgaag ccggctcccc aaccaaaatg ctgcaccaaa
1921 gctcgggcgc cgcgggcacg gctgctgcag tctcttccca gcctggccct ggcaaggggc
1981 gggtgggcgc tgccaggcgg gtgcttctcg acgcacttgc tcccggaggc tgcgccccgg
2041 cgcctggaac ccgaggtggg aggaccggtt ggtgtcaccc tgctcggccc tcagccctgc
2101 cgcgtggggc gcgtgggcac ggagcttcct gcctctgctc cgacacccgg caagcagccg
2161 gagacaaaac gccttaaagc ccccggccca gccctgcagg tatattgcag gggcctgggg
2221 gcggccctgg actggcgggc ggttccccag tggggtgccc tggaggctgc cgggcagagt
2281 ggagcagctt ggggccgtgc ccagggcggt ggctgtgagt ctagtttttg ctttaccaag
2341 tgtacagaaa tggcatttac gtttctctga tgctcccttg aagccataga atttaggggc
2401 ttttttaaaa aaataaaaga aaaatgaaac caaaaaaaaa aaaaaa



FIGURE 3E: Predicted amino acid sequence of human casein kinase 1, gamma 2 (SEQ 
ID NO:4) 

1 mdfdkkggkg eteegrrmsk agggrsshgi rssgtssgvl mvgpnfrvgk kigcgnfgel 
61 rlgknlytne yvaiklepik srapqlhley rfykqlsate gvpqvyyfgp cgnynamvle
121 llgpsledlf dlcdrtftlk tvlmiaiqli trmeyvhtks liyrdvkpen flvgrpgtkr
181 qhaihiidfg lakeyidpet kkhipyrehk sltgtaryms inthlgkeqs rrddlealgh
241 mfmyflrgsl pwqglkadtl keryqkigdt kratpievlc enfpeematy lryvrrldff
301 ekpdydylrk lftdlfdrsg fvfdyeydwa gkplptpigt vhtdlpsqpq lrdktqphsk
361 nqalnstnge lnaddptagh snapitapae vevadetkcc cffkrrkrks lqrhk



FIGURE 3F: Predicted nucleotide sequence encoding human casein kinase 1, gamma 3 
(SEQ ID NO:5) 

1 gaattcaaag tggagtaccg caaacttgat atggaaaata aaaagaaaga caaggacaaa 
61 tcagatgata gaatggcacg acctagtggt cgatcgggac acaacactcg aggaactggg
121 tcttcatcgt ctggagtttt aatggttgga cctaacttta gagttggaaa aaaaattgga
181 tgtggcaatt ttggagaatt acgattaggg aaaaatttat acacaaatga atatgtggca
241 attaagttgg agcccatgaa atcaagagca ccacagctac atttggaata cagattctat
301 aagcagttag gatctggaga tggtatacct caagtttact atttcggccc ttgtggtaaa
361 tacaatgcta tggtgctgga actgctggga cctagtttgg aagacttgtt tgacttgtgt
421 gacagaacat tttctcttaa aacagttctc atgatagcta tacaactgat ttctcgcatg
481 gaatatgtcc attcaaagaa cttgatatac agagatgtaa aacctgagaa cttcttaata
541 ggacgaccaa gaaacaaaac ccagcaagtt attcacatta tagattttgg tttggcaaag
601 gaatatattg atccggagac aaagaaacac ataccataca gagaacacaa gagccttaca
661 ggaacagcta gatatatgag cataaacaca catttaggaa aagaacaaag tagaagagac
721 gatttagaag ctttaggtca tatgttcatg tattttctga gaggcagtct tccttggcaa
781 ggcttaaagg ctgacacatt aaaggagagg tatcagaaaa ttggagatac aaaacgggct
841 acaccaatag aagtgttatg tgaaaatttt ccagaaatgg caacatatct tcgttatgta
901 agaaggctag atttttttga aaaaccagac tatgaatact taagaaagct ttttactgac
961 ttgtttgatc gaaaaggata tatgtttgat tatgaatatg actggattgg taaacagttg
1021 cctactccag tgggtgcagt tcagcaagat cctgctctgt catcaaacag agaagcacat
1081 caacacagag ataagatgca acaatccaaa aaccagtcgg cagaccacag ggcagcttgg
1141 gactcccagc aggcaaatcc ccaccatttg agagctcacc ttgcagcaga cagacatggt
1201 ggctcggtac aggttgtaag ttctacaaat ggagagttaa acacagatga ccccaccgca
1261 ggacgttcaa atgcacccat cacagcccct actgaagtag aagtgatgga tgaaaccaag
1321 tgctgctgct ttttcaaacg aaggaaaagg aaaaccatac agcgccacaa atgactctgg
1381 acacagacag atcctgggga gttacttaca tgttcatctg ctgtcttgtg attaaaatca
1441 tctctgtagt gaccacgtat attttcaagg actcactctt agaaacaaaa atgtcatact
1501 atcatacttc attttgtggt tgtcttacat tctttttctt tttttttttc tctaatttaa
1561 cctttatgga agctttaaag ttttgtcaaa acatgagtgc tttgcccatc agtgaatgga
1621 atggaccaat gaggtggtat caatgaatat agttccatag aacattttcc agaagttctt
1681 ctgttgtaga aagcagtaca gtatcttaag tgtcaaccag ttatatacct aatctggttt
1741 tttataactt ctgtaagagc ataatcaaac aggaattttc ttttctcagt ggataataca
1801 acagagaaaa cagagttgcc caaatattta aaagaagtta ttccttgaga agttcatatt
1861 ttgtgacatc tgcattgatt tcagtattac tgatggtact gttattcata agtcatatta
1921 acattctctc cgtgaaatca tggtacagtc actgcccaga ggtactgagg aaaagcaata
1981 tgggttcggc agatggtggt ggtaaaatga atcttaagga gtgtggtaaa tatgtgctcc
2041 gcttttgttg catcactatg tgaagtactg tgttgcagaa gtggcaaaag cgcttatttt
2101 taaaaatgca aaatatttgt acaatgtaac tttatgcttc caaataataa tgtatgttag
2161 acagcaagaa atgaatactt taaaaagtga tatatgttgg agttataaag aaatacacta
2221 aggagaggta gtaaatgtga accttgttgc agtgtataag gtggaagcct aaagaaatct
2281 caccgaaact tactgctgaa tgattacatt ctcccttaag cagaaaactt tggatgtgcc
2341 atgcaatggt gtctgtgtaa ttattttgct ctttgattaa aaaaaagacc cccagcaata
2401 aaaagtgggt cactctatgc



FIGURE 3G: Predicted amino acid sequence of human casein kinase 1, gamma 3 (SEQ 
ID NO:6)

1 menkkkdkdk sddrmarpsg rsghntrgtg ssssgvlmvg pnfrvgkkig cgnfgelrlg 
61 knlytneyva iklepmksra pqlhleyrfy kqlgsgdgip qvyyfgpcgk ynamvlellg 
121 psledlfdlc drtfslktvl miaiqlisrm eyvhsknliy rdvkpenfli grprnktqqv
181 ihiidfglak eyidpetkkh ipyrehkslt gtarymsint hlgkeqsrrd dlealghmfm 
241 yflrgslpwq glkadtlker yqkigdtkra tpievlcenf pematylryv rrldffekpd 
301 yeylrklftd lfdrkgymfd yeydwigkql ptpvgavqqd palssnreah qhrdkmqqsk 
361 nqsadhraaw dsqqanphhl rahlaadrhg gsvqvvsstn gelntddpta grsnapitap
421 tevevmdetk cccffkrrkr ktiqrhk 
FIGURE 4. CLUSTAL W (1.83) Protein Sequence Alignment Analysis

CKl g3 Hs MENKKK DKDKSDDRMARP SGRSGHNTRGTGS S S S - GVLMVGPNFRVGKKI GCGNFG 

CKl gl Hs MDHPSR EKDERQRTTKPMAQRS AHC SRP S GS S S S S GVLMVGPNFRVGKKI GCGNFG 

CKl g2 Hs MDFDKKGGKGETEEGRRMSKAGGGRSSHGIRSSGTSSG — VLMVGPNFRVGKKI GCGNFG 

CG6963 Dm MY STRQSVSTTT- GVLMVGPNFRVGKKI GCGNFG 

CKl g3 Hs ELRLGKNLYTNEYVAIKLEPMKSRAPQLHLEYRFYKQLGS-GDGIPQVYYFGPCGKYNAM 

CKl gl Hs ELRLGKNLYTNEWAIKLEPIKSRAPQLHLEYRFYKQLGSAGEGLPQVYYFGPCGKYNAM 

CKl g2 Hs ELRLGKNLYTNEYVAIKLEPIKSRAPQLHLEYRFYKQLSA-TEGVPQVYYFGPCGNYNAM 

CG69 63 Dm ELRLGKIKTLYlSnNEHVAIKMEPMKSKAPQLHLEY 

CKl g3 Hs VLELLGPSLEDLFDLCDRTFSLKTVLMIAIQLISRMEYVHSKNLIYRDVKPENFLIGRPR 

CKl gl HS VLELLGPSLEDLFDLCDRTFTLKTVLMIAIQLLSRMEYVHSKNLIYRDVKPENFLIGRQG 

CKl g2 Hs VLELLGPSLEDLFDLCDRTFTLKTVLMIAIQLITRMEYVHTKSLIYRDVKPENFLVGRPG 

CG6963 Dm WELLGPSLEDLFDICGRRFTLKSVLLIAIQLLHRIEYVHSRHLIYRDVKPENFLIGRTS 



CKl g3 Hs NKTQQVIHIIDFGLAKEYIDPETKKHIPYREHKSLTGTARYMSINTHLGKEQSRRDDLEA 
CKl gl Hs NKKEHVIHI IDFGL AKEYIDPETKKHI P YREHKSLTGTARYMS INTHLGKEQSRRDDLEA 
CKl g2 Hs TKRQHAI HI IDFGL AKEYIDPETKKHI P YREHKSLTGTARYMS INTHLGKEQSRRDDLEA 
CG6963 Dm TKREKIIHIIDFGLAKEYIDLDTNRHIPYREHKSLTGTARYMSINTHMGREQSRRDDLEA 

CKl g3 HS LGHMFMYFLRGSLPWQGLKADTLKERYQKIGDTKRATPIEVLCENFP-EMATYLRYVRRL 
CKl gl Hs LGHMFMYFLRGSLPWQGLKADTLKERYQKIGDTKRNTPIEALCENFPEEMATYLRYVRRL 
CKl g2 Hs LGHMFMYFLRGS L PWQGLKADTLKERYQKI GDTKRAT P I EVLCENF PEEMAT YLRYVRRL 
CG6963 Dm LGHMFMYFLRGSLPWQGLKADTLKERYQKIGDTKRATPIEVLCDGHPEEFATYLRYVRRL 

CKl g3 Hs DFFEK PD YE YLRKL FTDLFDRKG YMFD YE YDWI GKQL PT PVGAVQQD P - ALS SN- REAHQ 
CKl gl Hs DFFEK PDYEYLRTLFTDLFEKKGYTFDYAYDWVGRPIPTPVGSVHVDSGASAIT-RESHT 
CKl g2 Hs DFFEKPDYDYLRKLFTDLFDRSGFVFDYEYDWAGKPLPTPIGTVHTDLPSQPQL-RDKTQ 
CG69 63 Dm DFFETPDYDFLRRLFQDLFDRKGYTDEGEFDWTGKTMSTPVGSLQTGHEVIISPNKDRHN 



CKl g3 Hs HRDKMQQSKNQSADHRAAWDSQQANPHHLRAHLAADRHGGSVQWSSTNGELNTDDPTAG 

CKl gl Hs HRDRPSQ QQP LRN QWS S TNGELNVDD PTGA 

CKl g2 Hs PHS KN QALNS TNGELNADD PTAG 

CG69 63 Dm VTAKTNAKGG VAAWPDVPKPGATLGNLTPADRHG- SVQWS STNGELNPDDPTAG 



CKl g3 Hs RSNAPITAPTEVEVMDETKCCCFFKRRKRKTIQRHK 
CKl gl Hs HSNAPITAHAEVEWEEAKCCCFFKRKRKKTAQRHK 
CKl g2 Hs HSNAPITAPAEVEVADETKCCCFFKRRKRKSLQRHK 
CG6963 Dm HSNTPITQQPEVEWDETKCCCFFKRKKKKSTRQK- 
FIGURE 6. Triglyceride levels of a CG1534 (Gadfly Accession Number) mutant

[Bar chart; groups shown: PX lines (n=125), PX6298.1 (GABARAP)]
FIGURE 8: HUMAN HOMOLOG OF CG1534



FIGURE 8A. BLASTP search result for CG1534 (Gadfly Accession Number)

>gi|6005764|ref|NP_009209.1| GABA(A) receptor-associated protein [Homo
sapiens]

gi|9789961|ref|NP_062723.1| gamma-aminobutyric acid receptor associated
protein; GABA-A receptor-associated protein [Mus musculus]

Query: Drosophila CG1534 gene product, 121 amino acids; Sbjct: human
NP_009209.1

Score = 200 bits (508), Expect = 2e-51; Identities = 107/117 (91%), 
Positives = 113/117 (96%) 

Query: 1  MKFQYKEEHAFEKRRAEGDKIRRKYPDRVPVIVEKAPKARIGDLDKKKYLVPSDLTVGQF 60
          MKF YKEEH FEKRR+EG+KIR+KYPDRVPVIVEKAPKARIGDLDKKKYLVPSDLTVGQF
Sbjct: 1  MKFVYKEEHPFEKRRSEGEKIRKKYPDRVPVIVEKAPKARIGDLDKKKYLVPSDLTVGQF 60

Query: 61 YFLIRKRIHLRPEDALFFFVNNVIPPTSATMGSLYQEHHEEDYFLYIAYSDENVYGM 117
          YFLIRKRIHLR EDALFFFVNNVIPPTSATMG LYQEHHEED+FLYIAYSDE+VYG+
Sbjct: 61 YFLIRKRIHLRAEDALFFFVNNVIPPTSATMGQLYQEHHEEDFFLYIAYSDESVYGL 117
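
The identity figure reported for this alignment (107/117, 91%) can be reproduced with a short counting sketch. The following is an illustration only, not part of the BLASTP output: the human residues are taken from SEQ ID NO:8, and the Drosophila residues 1-117 are an assumption, read from the CG1534 rows of the alignments in this document where the extracted text is legible. The helper name `count_identities` is hypothetical. Because the alignment is ungapped over this range, a position-by-position comparison is sufficient.

```python
# Drosophila CG1534, residues 1-117 (assumed reading of the aligned Query rows).
query = (
    "MKFQYKEEHAFEKRRAEGDKIRRKYPDRVPVIVEKAPKARIGDLDKKKYLVPSDLTVGQF"
    "YFLIRKRIHLRPEDALFFFVNNVIPPTSATMGSLYQEHHEEDYFLYIAYSDENVYGM"
)
# Human GABARAP, residues 1-117 (SEQ ID NO:8).
sbjct = (
    "MKFVYKEEHPFEKRRSEGEKIRKKYPDRVPVIVEKAPKARIGDLDKKKYLVPSDLTVGQF"
    "YFLIRKRIHLRAEDALFFFVNNVIPPTSATMGQLYQEHHEEDFFLYIAYSDESVYGL"
)


def count_identities(a: str, b: str) -> int:
    """Count positions at which two equal-length, ungapped sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same aligned length")
    return sum(x == y for x, y in zip(a, b))


identities = count_identities(query, sbjct)
percent = round(100 * identities / len(query))
print(f"Identities = {identities}/{len(query)} ({percent}%)")
```

The "Positives" figure additionally depends on the substitution matrix used by BLASTP and is not recomputed in this sketch.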



FIGURE 8B: Predicted nucleotide sequence encoding human GABA(A) receptor- 
associated protein (SEQ ID NO: 7) 

1 gctccgctga atccgcccgc gcgtcgccgc cgtcgtcgcc gccccccgtc ccggcccccc 
61 tgggttccct cagcccagcc ctgtccagcc cggttcccgg gaggatgaag ttcgtgtaca 
121 aagaagagca tccgttcgag aagcgccgct ctgagggcga gaagatccga aagaaatacc
181 cggaccgggt gccggtgata gtagaaaagg ctcccaaagc tcggatagga gacctggaca
241 aaaagaaata cctggtgcct tctgatctca cagttggtca gttctacttc ttgatccgga
301 agcgaattca tctccgagct gaggatgcct tgtttttctt tgtcaacaat gtcattccac
361 ccaccagtgc cacaatgggt cagctgtacc aggaacacca tgaagaagac ttctttctct
421 acattgccta cagtgacgaa agtgtctacg gtctgtgaag ctgctgcccc tgagctggag
481 gggggtctca ttctacaaag agagaggtgg cccccctttc ttgacctcct cctccttcaa
541 gctcaaacac cacctccctt attcaggacc ggcacttctt aatgtttgtg gctttctctc
601 cagcctctct taggaggggt aatggtggag ttggcatctt gtaactctcc tttctccttt
661 cttccccttt ctctgcccgc ctttcccatc ctgctgtaga cttcttgatt gtcagtctgt
721 gtcacatcca gtgattgttt tggtttctgt tccctttctg actgcccaag gggctcagaa
781 ccccagcaat cccttccttt cactaccttc ttttttgggg gtagttggaa gggactgaaa
841 ttgtgggggg aaggtaggag gcacatcaat aaagaggaaa ccaccaagct gaaaaaaaaa
901 aaaaaaaaaa aaaaaaaaaa aaaa



FIGURE 8C: Predicted amino acid sequence of human GABA(A) receptor-associated 
protein (SEQ ID NO:8) 

1 mkfvykeehp fekrrsegek irkkypdrvp vivekapkar igdldkkkyl vpsdltvgqf 
61 yflirkrihl raedalfffv nnvipptsat mgqlyqehhe edfflyiays desvygl 



FIGURE 8D: Predicted nucleotide sequence encoding human GABARAP like 1 (SEQ 
ID NO:9)

1 cgtcacagcc cgacgcgcca cccagctgtt tttgtgctca caagctctag cgaaaagccg 
61 ccggtatttc tccatctggc tctcctctac ctccaggcag gctcacccga gatccccgcc 
121 ccgaaccccc cctgcacact cggcccagcg ctgttgcccc cggagcggac gtttctgcag
181 ctattctgag cacaccttga cgtcggctga gggagcggga cagggtcagc ggcgaaggag
241 gcaggccccg cgcggggatc tcggaagccc tgcggtgcat catgaagttc cagtacaagg
301 aggaccatcc ctttgagtat cggaaaaagg aaggagaaaa gatccggaag aaatatccgg
361 acagggtccc cgtgattgta gagaaggctc caaaagccag ggtgcctgat ctggacaaga
421 ggaagtacct agtgccctct gaccttactg ttggccagtt ctacttctta atccggaaga
481 gaatccacct gagacctgag gacgccttat tcttctttgt caacaacacc atccctccca
541 ccagtgctac catgggccaa ctgtatgagg acaatcatga ggaagactat tttctgtatg
601 tggcctacag tgatgagagt gtctatggga aatgagtggt tggaagccca gcagatggga
661 gcacctggac ttgggggtag gggaggggtg tgtgtgcgcg acatggggaa agagggtggc
721 tcccaccgca aggagacaga aggtgaagac atctagaaac attacaccac acacaccgtc
781 atcacatttt cacatgctca attgatattt tttgctgctt cctcggccca gggagaaagc
841 atgtcaggac agagctgttg gattggcttt gatagaggaa tggggatgat gtaagtttac
901 agtattcctg gggtttaatt gttgtgcagt ttcatagatg ggtcaggagg tggacaagtt
961 ggggccagag atgatggcag tccagcagca actccctgtg ctcccttctc tttgggcaga
1021 gattctattt ttgacatttg cacaagacag gtagggaaag gggacttgtg gtagtggacc
1081 atacctgggg accaaaagag acccactgta attgatgcat tgtggcccct gatcttccct
1141 gtctcacact tcttttctcc catcccggtt gcaatctcac tcagacatca cagtaccacc
1201 ccaggggtgg cagtagacaa caacccagaa atttagacag ggatctctta cctttggaaa
1261 ataggggtta ggcatgaagg tggttgtgat taagaagatg gttttgttat taaatagcat
1321 taaactggaa ttgacaagag tgttgagcat ccctgtctaa cctgctcttt ctctttggtg
1381 ccccttatct caccccttcc ttggaattta ataagtctca ggcatttcca attgtagact
1441 aaaaccactc ttagcatctc ctctagtatt ttccatgtat caggacagag gtgtcttatg
1501 tagggagggg gcaagtatga agtaaggtaa ttatatacta ctctcattca ggattcttgc
1561 tcccatgctg ctgtcccttc aggctcacat gcacaggaat gctacatgat ggccagctgc
1621 ttccctcctt ggttatcatc cactgcagct gctagttaga aaggtttgga gggatgactt
1681 ttagtaaatc atggggattt tattgattta ttttcacttt tgggattttg tggggtggga
1741 gtggggagca ggaattgcac tcagacatga catttcaatt catctctgct aatgaaaagg
1801 gttctttctc ttgggggaaa tgtgtgtgtc agttctgtca gctgcaagtt cttgtataat
1861 gaagtcaatg ccatcaggcc aaggaaataa aataattgct taccttaaaa aaaaaaaaaa
1921 aaaaaaaaaa aaa



FIGURE 8E: Predicted amino acid sequence of human GABARAP like 1 (SEQ ID 
NO: 10) 

1 mkfqykedhp feyrkkegek irkkypdrvp vivekapkar vpdldkrkyl vpsdltvgqf
61 yflirkrihl rpedalfffv nntipptsat mgqlyednhe edyflyvays desvygk 

FIGURE 8F: Predicted nucleotide sequence encoding human GABARAP like 2 (SEQ ID 
NO:11)

1 cgacagccgg aagtcccgcc tgccgtgtag tcgccgccgt cgctgccgct gccgctgccg 
61 ccgtcgttgt tgttgtgctc ggtgcgctga gctccgcggc tccgcgagcc ggttccgtcc 
121 ccttcccgcc gccgccatga agtggatgtt caaggaggac cactcgctgg aacacagatg 
181 cgtggagtcc gcgaagattc gagcgaaata tcccgacagg gttccggtga ttgtggaaaa 
241 ggtctcaggc tctcagattg ttgacattga caaacggaag tacttggttc catctgatat 
301 cactgtggct cagttcatgt ggatcatcag gaaaaggatc cagcttcctt ctgaaaaggc
361 gatcttcctg tttgtggata agacagtccc acagtccagc ctaactatgg gacagcttta
421 cgagaaggaa aaagatgaag atggattctt atatgtggcc tacagcggag agaacacttt 
481 tggcttctga gggccattgc tgggctaggt gcaccgtaac tgcttgtgta tcttgtaaat 
541 agccagccat tttcagttat tataccagaa cctcttcaca tagacctatt agtgcatttg 
601 taactggatt tatttcttaa tatattggaa ggttttgttt ccttagacta gtaaattatc 
661 atacagagtt ttattttgag tttttctttt tgtgcattgt cctcatgcct gtattctcca 
721 ggaaacttgt ccttctggaa atcatattga atgatatttc tatatcgaag tgaggtaggt 
781 gcggtattaa agtgaaaggg aaggtgatgc atttattctg ggttatgctt gaagtgttag 
841 atggctaagt attaaaatta tccaaattaa atccttagca gtcagaacac ttgcttcact
901 agaatatgcc aactgccaat catgttggac tgagctaafct tgttcctctt tctgaaacta
961 ttaaggtaaa taattaacaa taaaaattct cttataaagg caaaaaaaaa aaaaaaaaaa
1021 aaaaaaaaaa a

FIGURE 8G: Predicted amino acid sequence of human GABARAP like 2 (SEQ ID
NO:12)

1 mkwmfkedhs lehrcvesak irakypdrvp vivekvsgsq ivdidkrkyl vpsditvaqf
61 mwiirkriql psekaiflfv dktvpqsslt mgqlyekekd edgflyvays gentfgf

FIGURE 8H: Predicted nucleotide sequence encoding human GABARAP like 3 (SEQ
ID NO:13)

1 gaaaagccgc cggtatttct ccacctggct ctcctctacc tccaggcagg cgcacccgag 
61 gtccccctcc caccccacct tctgccctcc cgcacacttg gaccagtgct gttgacccgg 
121 aagcggacat ttctgcagct attctaagca cacgtcggcg gagggagcgg gacgtggcca 
181 gcggtcagcg gcgaaggagg caggccctgc gcggggatca cggaagccct gtgattcacc 
241 atgaagttcc agtacaagga ggtccatccc tttgagtatc ggaaaaagga aggagaaaag 
301 atccggaaga aatatccgga cagggtcccc ttgattgtag agaaggctcc aaaagcaagg 
361 gtgcctgatc tggacaggag gaagtaccta gtgccctccg accttaccga tggccagttc
421 taccttttaa tccggaagag aatccacctg agacctgagg acgccttatt cttctttgtc 
481 aacaacacta tccctcccac tagtgctacc atgggccaac tatatgagga cagtcatgag 
541 gaagatgatt ttctgtatgt ggcctacagt aatgagagtg tctatgggaa atgagtggtt 
601 ggaagcccag cagatggaag cacctggact taggggtagg ggaggggtgt gtgtgtgact 
661 tggggaaaga gagggcggct cccaccgtga ggagacagaa ggtgaagaca tatagaaact 
721 ttacaccgca cacaccgtca acgcattttc acatgctcaa ctgatatttt ttgttgcttc 
781 cttggcccag ggagaaagca tgtcaggaca gagctgttgg attggctttg atagaggaat 
lol a?o^ g 0 taa " ttatg ^cattcctga gatttaattt ttgtgcagtt tcaLgaaag 
Q6 l SSoS ?? agg ^ ggacaa ffttggggtca gagatgatgg cagtccagca gcaactccct 
961 gtgctccctt ctctttgggc agagattctg tttttgacag ttgcacaaga caggtaggga 
loll o^ gg r Ctt gtggtagt ^ ^ccatacctg gggacgaaaa gagacccact gtfattgSg 
1081 catcgtggcc cccgatcttc cgtatcccac acttcttttc tcccatccca gttgcaatct 
} cactcacaaa catcacagta ccaccccagg ggcggcagta gacaccaacc cagaaattta 

llll llllllllll t 2 t ^ Ct " ggaaaatagg ffSTttaggcat gagggtggtt atgattaaga 
1261 agataatttt gttgttaaat agcattaaac tggaattgac agagtgagtt gagcatctct 
llll SST 3 ^f ttctcfc ctggtgctcc tcatctcacc cc t ac?t?gg aa^ttaataa 
III} S^ttcaggca tttccaattg cagactaaaa ccacttctac catctcctct agtattttcc 
Ittl a !? ta tCagg aca 9agatgt cttatgtagg gaaggggcag gtatgaagtg aggtagatta 
1501 tctatacctc tcactcattc aggattctcg ctcccatgct gctgtccctt cS?tc?caca 
Hoi = tCa = aggaa tgctatgtga tggccagctg cttcccttct tggttatcca ctgcagctgc 
1621 tagttagaaa ggtttgcagg gatgactttt agtaaatcat ggggatttta ttgatttaSt 
llll SSSiS** gga " ttgtg gg ^tgggagt ggggagcagg aattgcactc agacatgaca 
1 fini " t ^ aattca tctctgcaaa tgaaaagggt tcttcctctt gggggaaatc tgtgtgtcag 
1801 ttctgtcagc tgcaagttct tgtgtaatga agtcaatgct gtcaggccaa g 

FIGURE 81: Predicted amino acid sequence of human GABARAP like 3 (SEQ ID 
NO: 14) 

1 mkfqykevhp feyrkkegek irkkypdrvp livekapkar vpdldrrkyl vpsdltdgqf
61 yllirkrihl rpedalfffv nntipptsat mgqlyedshe eddflyvays nesvygk
FIGURE 9. CLUSTAL W (1.82) Protein Sequence Alignment Analysis



GABARAP-13 Hs MKFQYKEVHPFEYRKKEGEKIRKKYPDRVPLIVEKAPKARVPDLDRRKYLVPSDLTDGQF 

GAB ARAP -11 HS MKFQYKEDHPFEYRKKEGEKIRKKYPDRVPVIVEKAPKARVPDLDKRKYLVPSDLTVGQF 

GABARAP Hs MKFVYKEEHPFEKRRSEGEKIRKKYPDRVPVIVEKAPKARIGDLDKKKYLVPSDLTVGQF 

CGI 534 Dm MKFQYKEEHAFEKRRAEGDKIRRKYPDRVPVIVEKAPKARIGDLDKKKYLVPSDLTVGQF 

CGI 2334 Dm 1XDSTYQYKKDHSFDKRRNEGDKIRRKYPDRVPVIVEKAPKTRYAELDKKKYLVPADLTVGQF 

GABARAP - 1 2 Hs MKWMFKEDHSLEHRCVESAKIRAKYPDRVPVIVEKVSGSQIVD1DKRKYLVPSDITVAQF 



GABARAP-13 Hs YLLIRKRIHLRPEDALFFFVlSnSITIPPTSATMGQLYEDSHEEDDFriYVAYSNESVYGK 

GABARAP -11 Hs YFIilRKRIHDRPEDALFFFVlSnsrTIPPTSATMGQLYEDNHEEDYFLYVAYSDESVYGK 

GABARAP Hs YFLIRKRIHLRAEDALFFFWNVIPPTSATMGQLYQEHHEEDFFLYIAYSDESVYGL 

CGI 5 3 4 Dm YFLIRKRIHLRPEDALFFFVNWIPPTSATMGSLYQEHHEEDYFLYIAYSDENVYGMAKI 
CG12334 Dm YFLIRKRII^RPDDALFFFVlXnWIPPTSATMGALYQEHFDKDYFLYISYTDENVYGRQ — 
GABARAP - 1 2 Hs MWI I RKRI QL PS EKAI FLFVDKTVPQS SLTMGQL YEKEKDEDGFLYVAYSGENTFGF 



GABARAP-13 Hs - 

GABARAP- 11 Hs - 
GABARAP HS 

CG1534 Dm N 
CG12334 Dm 

GABARAP- 12 Hs - 
FIGURE 11. Triglyceride content of a CG10576 (Gadfly Accession Number) mutant

[Bar chart; groups shown: EP-control, EP(3)3271]
FIGURE 13: HUMAN HOMOLOG OF CG10576

FIGURE 13A. BLASTP result for CG10576 (Gadfly Accession Number)
Homology to human PA2G4 (gene ref XM_049048; protein ref XP_049048.1)

ref|XP_049048.1| (XM_049048) proliferation-associated 2G4, 38kD [Homo
sapiens]

sp|Q9UQ80|P2G4_HUMAN PROLIFERATION-ASSOCIATED PROTEIN 2G4 (CELL CYCLE
PROTEIN P38-2G4 HOMOLOG) (HG4-1)

gb|AAD05561.1| (AF104670) cell cycle protein [Homo sapiens]

gb|AAH01951.1|AAH01951 (BC001951) proliferation-associated 2G4, 38kD [Homo
sapiens]

gb|AAH07561.1|AAH07561 (BC007561) Unknown (protein for MGC:15488)
[Homo sapiens], Length = 394

Score = 425 bits (1081) , Expect = e-118 

Identities = 212/386 (54%), Positives = 276/386 (70%), Gaps = 3/386 (0%) 

Query: 1 MADVEKEPEKTIAEDLVVTKYKLAGEIVMK^ 60 

M+ ++ + E+TIAEDLWTKYK+ G+I N+ L++++ SV +C +GD + EET 

Sbjct: 1 MSGEDEQQEQTIAEDLWTKYKMGGDIANRVLRSLVEASSSGVSVLSLCEKGDAMIMEET 60 

Query: 61 GKVYKKEKDLKKGIAFPTCLSVNNCVCHFSPAKNDADYTLKAGDVVKIDLGAHIDGFIAV 120 

GK+ +KKEK+ +KKGIAFPT + SVNNCVCHFSP K+D DY LK GD+VKIDLG H+DGFIA 
Sbjct: 61 GKIFKKEKEMKKGIAFPTSISVNNCVCHFSPLKSDQDYILKEGDLVKIDLGVHVDGFIAN 120 

Query: 121 AAHTIWGAAADQKISGRQADVILAAYWAVQAALRLLKSGANNYSLTDAVQQISESYKCK 180 

AHT W A +++GR+ADVI AA+ +AALRL+K G N +T+A + + + S+ C 
Sbjct: 121 VAHT FWDVAQGT Q VT GRKADVI KAAHL C AEAALRLVK PGNQNT QVT E AWNKVAHS FNC T 180 

Query: 181 PIEGMLSHELKQFKIDGEKTIIQNPSEAQRKEHEKCTFETYEVYAIDVIVSTGEGVGREK 240 

PIEGMLSH+LKQ ( IDGEKTIIQNP++ Q+K+HEK FE + EVYA+ DV+VS + GEG + + 
Sbjct: 181 PIEGMLSHQLKQHVIDGEKTIIQNPTDQQKKDHEKAEFEVHEVYAVDVLVSSGEGKAKDA 240 

Query: 241 DTKVSIYKKS-EENY^KMKASRALLAEVKTKYGN^ 299 

+ +IYK+ + Y LKMK SRA +EV+ ++ MPF +R+FE+E KARMGWEC H+ 
Sbjct: 241 GQRTTIYKRDPSKQYGLKmTSRAFFSEVERRFDAMPFTLRAFEDEKKARMGWECAKHE 300 

Query: 3 00 MIEPFQVLYEKPSEIVAQFKHTVLLMPNGVI^LVTGIPFEAENYVSEYSVAQEELKTLLAQ 359 

+ + + PF VLYEK E VAQFK TVLLMPNG +T PFE + Y SE V ELK LL 
Sbjct: 3 01 LLQPFNVLYEKEGEFVAQFKFTVLLMPNGPMRITSGPFEPDLYKSEMEVQDAELKALLQS 3 60 

Query: 3 60 PLGPVKGKGKGKKA — TAGAATKVET 3 83 . 

K K KKA TA AT ET 
Sbjct: 3 61 SASRKTQKKKKKKASKTAENATSGET 3 86 
FIGURE 13B: Predicted nucleotide sequence encoding human proliferation associated
protein 2G4 (SEQ ID NO: 15)



1 ggatcgaggg gactctgacc acagcctgtg gctgggaagg gagacagagg cggcggcggc
61 tcaggggaaa cgaggctgca gtggtggtag taggaagatg tcgggcgagg acgagcaaca
121 ggagcaaact atcgctgagg acctggtcgt gaccaagtat aagatggggg gcgacatcgc
181 caacagggta cttcggtcct tggtggaagc atctagctca ggtgtgtcgg tactcagcct
241 gtgtgagaaa ggtgatgcca tgattatgga agaaacaggg aaaatcttca agaaagaaaa
301 ggaaatgaag aaaggtattg cttttcccac cagcatttcg gtaaataact gtgtatgtca
361 cttctcccct ttgaagagcg accaggatta tattctcaag gaaggtgact tggtaaaaat
421 tgaccttggg gtccatgtgg atggcttcat cgctaatgta gctcacactt ttgtggttga
481 tgtagctcag gggacccaag taacagggag gaaagcagat gttattaagg cagctcacct
541 ttgtgctgaa gctgccctac gcctggtcaa acctggaaat cagaacacac aagtgacaga
601 agcctggaac aaagttgccc actcatttaa ctgcacgcca atagaaggta tgctgtcaca
661 ccagttgaag cagcatgtca tcgatggaga aaaaaccatt atccagaatc ccacagacca
721 gcagaagaag gaccatgaaa aagctgaatt tgaggtacat gaagtatatg ctgtggatgt
781 tctcgtcagc tcaggagagg gcaaggccaa ggatgcagga cagagaacca ctatttacaa
841 acgagacccc tctaaacagt atggactgaa aatgaaaact tcacgtgcct tcttcagtga
901 ggtggaaagg cgttttgatg ccatgccgtt tactttaaga gcatttgaag atgagaagaa
961 ggctcggatg ggtgtggtgg agtgcgccaa acatgaactg ctgcaaccat ttaatgttct
1021 ctatgagaag gagggtgaat ttgttgccca gtttaaattt acagttctgc tcatgcccaa
1081 tggccccatg cggataacca gtggtccctt cgagcctgac ctctacaagt ctgagatgga
1141 ggtccaggat gcagagctaa aggccctcct ccagagttct gcaagtcgaa aaacccagaa
1201 aaagaaaaaa aagaaggcct ccaagactgc agagaatccc accagtgggg aaacattaga
1261 agaaaatgaa gctggggact gaggtgcgtc ccatctcccc agcttgctgc tcctgcctca
1321 tccccttccc accaaacccc agactctgtg aagtgcagtt cttctccacc taggaccgcc
1381 agcagagcgg ggggatctcc ctgcccccac cccagttccc caacccactc ccttccaaca
1441 acaaccagct ccaactgact ctggtcttgg gaggtgaggc ttcccaacca cggaagacta
1501 ctttaaacga aaaaaagaaa ttgaataata aaatcaggag tcaaaattca tcgtcttcaa
1561 ggcccctctt tctagccttt tctactactc tctgcttggt caaggtttgt gccccactac
1621 agaacagggc taaattagcc accaccactg aaaactcagc cgaatttttt tataccactc
1681 tgacgtcagc atttttt



FIGURE 13C: Predicted amino acid sequence of human proliferation associated
protein 2G4 (SEQ ID NO: 16)

1 msgedeqqeq tiaedlvvtk ykmggdianr vlrslveass sgvsvlslce kgdamimeet
61 gkifkkekem kkgiafptsi svnncvchfs plksdqdyil kegdlvkidl gvhvdgfian
121 vahtfvvdva qgtqvtgrka dvikaahlca eaalrlvkpg nqntqvteaw nkvahsfnct
181 piegmlshql kqhvidgekt iiqnptdqqk kdhekaefev hevyavdvlv ssgegkakda
241 gqrttiykrd pskqyglkmk tsraffseve rrfdampftl rafedekkar mgvvecakhe
301 llqpfnvlye kegefvaqfk ftvllmpngp mritsgpfep dlyksemevq daelkallqs
361 sasrktqkkk kkkasktaen ptsgetleen eagd
FIGURE 14. CLUSTAL W (1.7) Protein Sequence Alignment Analysis



CG10576 
XP_049048 .1 



MADVEKEPEKTIAEDLVVTKYKLAGEIWIKTLKAVIGLCVVDASVREICT 
MSGEDEQQEQTIAEDLWTKYKMGGDIAlSrRVLRSLVEASSSGVSVLSLCEKGDAMIMEET 



*.************. *.* * . * . . . . 



* * . * . * * 



CG10576 
XP__049048.1 



GKVYKKEKDLKKGIAFPTCLSVJSnKTCVCHFSPAKNDADYTLKAGDWKIDLGAHIDG 
GKIFKKEKEMKKGIAFPTSISVNNCVCHFSPLKSDQDYILKEGDLVKIDLGVHVDGFIAN 
**..****.. ******** ************ * * ** ** ********* ******* 



CG10576 
XP_049048 .1 



AAHT I WGAAADQKI SGRQADVI LAAYWAVQAALRLLKS GANNYS LTDAVQQI SES YKCK 
VAHTFVVDVAQGTQVTGRKADVIKAAHLCAEAAIjRLVKPGNQOT 



*** . ** * 



, * * . * * * * * * . 



>******* * * * • * m * • • • * * • • * 



CG10576 
XP_049048 .1 



PIEGMLSHELKQFKIDGEKTIIQNPSEAQRKEHEKCTFETYEVYAIDVIVSTGEGVGREK 
PIEGMLSHQLKQHVIDGEKTIIQNPTDQQKKDHEKAEFEVHEVYAVDVLVSSGEGKAKDA 
************ ************* ******* ** ***********.*** 



CG10576 
XP_049048 .1 



DTKVSIYKKSE-ENYML]mKASRALLAEW^ 

GQRTTIYKRDPSKQYGLKMKTSRAFFSEVERRFDAMPFTLRAFEDEKKARMGWECAKHE 



*** ******* ********* *. 



CG10576 
XP_049048 .1 



MI EPFQVLYEKPSEIVAQFKHTVLLMPNGV1SILVTGI PFEAENYVSEYSVAQEELKTLLAQ 

LLQPFIST^YEKEGEFVAQFKFTVLLMPNGPMRITSGPFEPDLYKSEMEVQDAELKALLQS 
...**.***** ******* ******** .* *** . * ** * . ****** 



CG10576 
XP_049048 .1 



PIiGPVKGKGKGKKATAGAATKVETAPAVETKA — 
SASRKTQKKKKKKASKTAENATSGETLEENEAGD 



WO 03/066086 



PCT/EP03/01094 



26/61 



[Bar chart of relative RNA expression by tissue; legible tissue labels: kidney, spleen, lung, small intestine, colon]






[Bar chart of relative RNA expression by tissue: bone marrow, kidney, spleen, lung, small intestine, colon, brain, hypothalamus, pancreas, liver, muscle, BAT, WAT]






FIGURE 18: HUMAN HOMOLOG OF CG7858 (Mocs1)

FIGURE 18A. BLASTP search results for Mocs1 (Gadfly Accession Number CG7858)

gb|AAB87523.1| (AF034374) molybdenum cofactor biosynthesis protein A [Homo
sapiens] Length = 385

Score = 445 bits (1132), Expect = e-123 

Identities = 214/351 (60%), Positives = 274/351 (77%), Gaps = 7/351 (1%) 

Query: 41 ATASVQPLEPEKQVLRKNSP LTDS FGRHHT YLRI SLTERCNLRCDYCMPAEGVPL 95 

A A+ + + +Q LR++ + LTDSFGR H+YLRISLTE+CNLRC YCMP EGVPL 

Sb j c t : 3 6 ARAASEWSRRRQFLREHAAPFSAFLTDSFGRQHSYLRISLTEKCNLRCQYCMPEEGVPL 9 5 

Query: 96 QPKNKLLTTEEILRLAR 155 

PK LLTTEEIL LAR+FV++G+ KIRLTGGEP +R D+V+IVAQ+ + L L IG+TT 
Sbjct: 96 TPKANLLTTEEILTLARLFVKEGIDKIRLTGGEPLIRPDWDIVAQLQRLEGLRTIGVTT 155 

Query: 156 NGLVLTRLLLPLQRAGLDNLNISLDTLKRDRFEKITRRKGWERVIAGIDLAVQLGYRP-K 214 

NG+ L RLL LQ+AGL +NISLDTL +FE I RRKG+ +V+ GI A++LGY P K 
Sbjct: 156 NGINLARLLPQLQKAGLSAINISLDTLVPAKFEFIVRRKGFHKVMEGIHKAIELGYNPVK 215 

Query: 215 VNCVLMRDFNEDEICDFVEFTRJ^PVDVRFIEYMPFSGNK^ 274 

WCV+MR NEDE+ DF T P+DVRFIEYMPF GNKW+ ++++SYK+ L +RQ+W 
Sbjct: 216 WCVVMRGLNEDELLDFAALTEGHPLDVRFIEYMPFDGNK^ 275 

Query: 275 PDFKALPNGPNDTSKAYAVPGFKGQV^^ 334 

P+ + +P + T+KA+ +PGF+GQ+ FITSM+EHFCGTCNRLR+TADGN+KVCLFGN E 
Sb j c t : 276 PELEKVPEEES STAKAFKI PGFQGQI SFXTSMS EHFCGTCNRLRITADGNLKVCLFGNSE 335 

Query: 335 F S LRD AMRDES VS EE QLVDL I GAAVQRXKKQHAGMLNLS QMENRPMI LI GG 385 

SLRD +R SE++L+ +IGAAV RKK+QHAGM + + S QM+NRPMI L I GG 

Sbjct: 33 6 VSLRDHLR- AGASEQELLRI IGAAVGRKKRQHAGMFS I S QMKNRPMI LI GG 385 



ref|XP_046687.1| (XM_046687) molybdenum cofactor synthesis 1 [Homo sapiens]
emb|CAA11897.1| (AJ224328) MOCS1A protein [Homo sapiens]
emb|CAC44527.1| (AJ293577) MOCS1A enzyme [Homo sapiens] Length = 385


Score = 444 bits (1129), Expect = e-123 

Identities = 214/348 (61%), Positives = 272/348 (77%), Gaps = 7/348 (2%) 

Query: 44 SVQPLEPEKQVLRKNSP LTDS FGRHHT YLRI SLTERCNLRCDYCMPAEGVPLQPK 98 

S Q + +Q LR+ + + LTDSFGR H+YLRISLTE+CNLRC YCMP EGVPL PK 

Sbjct: 39 SSQEVSRRRQFLREHAAPFSAFLTDSFGRQHSYLRISLTEKCNLRCQYCMPEEGVPLTPK 98 

Query: 99 NKLLTTEEILRLARIFVEQGVRKIRLTGGEPTVRRDIVF^ 158 

LLTTEEIL LAR+FV++G+ KIRLTGGEP +R D+V+IVAQ+ + L L IG+TTNG+ 
Sbjct: 99 AKLLTTEEILTLARLFVKEGIDKIRLTGGEPLIRPDVVDIVAQLQRLEGLRTIGVTTNGI 158 

Query: 159 Vl^TRLLLPLQRAGLDNLNISLDTLKRDRFEKITRRKGWERVIAGIDLAVQLGYRP-KVNC 217 

L RLL LQ+AGL +NISLDTL +FE I RRKG+ +V+ GI A++LGY P KVNC 
Sbjct: 159 NLARLLPQLQI^GLSAINISLDTLVPAKFEFIVRRKGFHKVMSGIHKAIELGYINJPVKVNC 218 



Query: 218 VLKRDFNEDEICDFVEFTRKRPVDVRFIEYMPFSGNKWHTERLISYKDTLQIIRQRWPDF 277 
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V+MR NEDE+ DF T P+DVRFIEYMPF GNKW+ + +++SYK+ L +RQ+WP+ 
Sbjct: 219 WMRGLNEDELLDFAALTEGLPLDWFIEYM^ 278 

Query: 278 KALPNGPNDTSKAYAVPGFKGQVGFITSMTEHFCGTCI^^ 337 

+ +P + T+KA+ +PGF+GQ+ FITSM+EHFCGTCNRLR+TADGN+KVCLFGN E SL 
Sbjct: 279 EKVPEEESSTAKAFKIPGFQGQISFITSMSEHFCGTCNRLRITADGNLKVCLFGNSEVSL 338 

Query: 338 RDAMRDESVSEEQLVDLIGAAVQRKKK^ 385 

RD +R SE++L+ +IGAAV RKK+QHAGM ++SQM+NRPMILIGG 

Sbjct: 339 RDHLR- AGASEQELLRI IGAAVGRKKRQHAGMFS I SQMKNRPMILIGG 385 



FIGURE 18B: Predicted nucleotide sequence encoding human molybdenum cofactor
biosynthesis protein A and C (SEQ ID NO:17)

1 cgctcgtatc aggcttcatg gcggcgcggc cactgtcccg gatgctgcgg cggcttctga
61 ggtccagcgc ccggagctgc agctcagggg ctccggtgac ccagccctgc cccggggagt
121 ccgcgcgagc tgcctcggag gaggtgtcca ggcggaggca gttcctgcgg gagcatgcgg
181 cccccttctc cgccttcctc acagacagct tcggccggca gcacagctac ctgcggatct
241 ccctcacaga gaagtgcaac ctcagatgtc agtactgcat gcccgaggag ggggtcccgc
301 tgacccccaa agccaacctg ctgaccacag aggagatcct gaccctcgcc cggctctttg
361 tgaaggaagg catcgacaag atccggctca caggtggaga gccgcttatc cggccggacg
421 tggtggacat tgtggcccag ctccagcggc tggaagggct gagaaccata ggtgttacca
481 ccaatggcat caacctggcc cggctactgc cccagcttca gaaggctggt ctcagtgcca
541 tcaacatcag cctggacacc ctggtgcctg ccaagtttga gttcattgtc cgcaggaaag
601 gcttccacaa ggtcatggag ggcatccaca aggccatcga gctgggctac aaccctgtga
661 aggtgaactg tgtggtgatg cgaggcctta acgaggatga actcctggac tttgcggcct
721 tgactgaggg ccaccccctg gatgtgcgct tcatagagta tatgcccttt gatggcaaca
781 agtggaactt caagaagatg gtcagctata aggagatgct agacactgtc cggcagcagt
841 ggccagagct ggagaaggtg ccagaggagg aatccagcac agccaaggcc tttaaaatcc
901 ctggcttcca aggccagatc agcttcatca catccatgtc tgagcatttc tgtgggacct
961 gcaaccgcct gcgaatcaca gctgatggga acctcaaggt ctgcctcttt ggaaactctg
1021 aggtatccct gcgggatcac ctgcgagctg gggcctctga gcaggagctg ctgagaatca
1081 ttggggctgc tgtgggcagg aagaagcggc agcatgcagg catgttcagt atttcccaga
1141 tgaagaaccg gcccatgatc ctcatcggtg ggtgacccat caagttattt ttgatgttcc
1201 ccaattcccc accagccaat ccaagcattt tctcctggga cccgctccat gttcagggtc
1261 taagacccag aatgagtttc tccagccagg tggccacttt atggaaagga tgcagggtcc
1321 cccagacccc tcctctagcc cagcagcggc tggggtctgg ctcctttcag agacactaca
1381 cttcccgtgc agactcagat gccaactcaa agtgccttag cccaggttcc tgggcttctg
1441 ctgccccctc aggaccccag ctaacctcag aacaactaac tcatgtggac tcggaaggac
1501 gggcagctat ggtagatgtg ggcaggaagc cagacacaga gcgggtggct gtggcttcag
1561 ccgtggtcct cctgggaccg gtagccttca agcttgtcca gcagaaccag ctcaagaaag
1621 gagatgccct agtggtggcc cagctggctg gagtccaggc agccaaggtg accagccagc
1681 tgatccctct gtgccaccac gtggccctga gccacatcca ggtgcagctg gagctggaca
1741 gcacacgcca tgccgtgaag atccaggcat cttgccgggc tcggggcccc accggggtgg
1801 agatggaggc cctgacctct gctgcagtgg ccgccctcac cctgtatgac atgtgcaagg
1861 ctgtcagcag ggacatcgtg ttggaggaga tcaagctcat tagcaagact ggtggtcagc
1921 ggggggactt ccatcgggct tagcacctgc ccttctcacc catggcccac ccaggcctgg
1981 agctgggatg caatgtaggc tgagggaaag acgtcaggtt cctttaatca cagtcactgt
2041 ttgtttacct tgagcagtaa acccgaagtc agcctgctct actactaaca aacaggcctg
2101 ctgctagatg atctctaatg accaatgggg cttcctttct atagggagga taccagcagg
2161 cccttaagcc ttccaggaca ctaaggtcgt gggagcggga ctgcaacaag caatgccaga
2221 taactgagaa atcatgttct ttgtggacta tttcagacaa ccaggttccg acagtccagc
2281 ccagaacttt tccttctcat tttgggtttt ctcttctcct gctttcctgg ggagagatta
2341 agcgctcatt aagcagagga gcccactttg aggagagcaa agcacaagct tgcttgaaga
2401 atggatccca acttctcccc ggcagctctg cctccctaag tctgtgaagc cgcagccctg
2461 ccctgtcctg tcctgtcctg acttcatctc tccttctgcc caagtctgtg tcccatcaga
2521 cttgcagcct ttcagcttaa cagttgcccg gtcctgctgg ccccttttcc tctggccccc
2581 ctcttctgaa acaggatgtg cacacatggg ccatagccct aaggactcct gccagaccac
2641 acagcccaca cctggccctg ttcacggctg ttccacccac ccctctttat tctggagcat
2701 atcagggaaa gaaaagttga tgatagattg ccttcaccct cacagcgcac aaataaagct
2761 acgatgccaa ctttgcagat gcaagaatga agacactgtg tgggtagggc actgagctgc
2821 tgcagtttca cagggaaggc tgcacctatc aatcaatcaa tcaatcctat cccaagacac
2881 agttccctga gggaagaaga ggagggacct ggaaaggcct aagggtgtac tctctgtata
2941 gccccgctat gggaaaataa agtggagtag ggggcataga aaaaaaaaaa aaaaaaaaaa
3001 aaaaaaaaaa aaa



FIGURE 18C: Predicted amino acid sequence of human molybdenum cofactor 
biosynthesis protein A (SEQ ID NO:18) 

1 maarplsrml rrllrssars cssgapvtqp cpgesaraas eevsrrrqfl rehaapfsaf
61 ltdsfgrqhs ylrisltekc nlrcqycmpe egvpltpkan lltteeiltl arlfvkegid
121 kirltggepl irpdvvdiva qlqrleglrt igvttnginl arllpqlqka glsainisld
181 tlvpakfefi vrrkgfhkvm egihkaielg ynpvkvncvv mrglnedell dfaalteghp
241 ldvrfieymp fdgnkwnfkk mvsykemldt vrqqwpelek vpeeesstak afkipgfqgq
301 isfitsmseh fcgtcnrlri tadgnlkvcl fgnsevslrd hlragaseqe llriigaavg
361 rkkrqhagmf sisqmknrpm iligg
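
The relationship between the nucleotide sequence of Figure 18B and the amino acid sequence of Figure 18C can be illustrated with a short Python sketch. It assumes the standard genetic code and assumes that translation starts at the ATG at position 18 of SEQ ID NO:17. The names `CODON_TABLE` and `translate` are hypothetical and serve only this illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str, start: int = 0) -> str:
    """Translate DNA from `start` (0-based) up to the first stop codon."""
    dna = dna.replace(" ", "").lower()
    protein = []
    for i in range(start, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":  # stop codon ends the open reading frame
            break
        protein.append(residue)
    return "".join(protein)


# First 42 coding bases of Figure 18B, beginning at the ATG at position 18.
fragment = "atg gcggcgcggc cactgtcccg gatgctgcgg cggcttctg"
print(translate(fragment))
```

Under these assumptions the fragment translates to MAARPLSRMLRRLL, which agrees with the first fourteen residues listed in Figure 18C.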



FIGURE 18D: Predicted nucleotide sequence encoding human MOCS1 protein, isoform 
1 (SEQ ID NO: 19) 

1 gccagaaatc ttcccagtag agatcaccat ccgcccccga cccccaagct gaatacttaa 
61 ggggtgggtc cttcccatca agctgatttc tcaacgagag ggacaatccc agcttcccca 
121 acattgcaga gcccaaacat gtggaagagt tggaagctcc gcacagatgt cagagtaagg 
181 gagggggcag gcggttctcc ttgtgcctct tcccagcccg gtagcagggg cccatgcttc 
241 ctccctggtc tgtcctcgca ggaggtgtcc aggcggaggc agttcctgcg ggagcatgcg 
301 gcccccttct ccgccttcct cacagacagc ttcggccggc agcacagcta cctgcggatc
361 tccctcacag agaagtgcaa cctcagatgt cagtactgca tgcccgagga gggggtcccg
421 ctgaccccca aagccaacct gctgaccaca gaggagatcc tgaccctcgc ccggctcttt 
481 gtgaaggaag gcatcgacaa gatccggctc acaggtggag agccgcttat ccggccggac 
541 gtggtggaca ttgtggccca gctccagcgg ctggaagggc tgagaaccat aggtgttacc 
601 accaatggca tcaacctggc ccggctactg ccccagcttc agaaggctgg tctcagtgcc
661 atcaacatca gcctggacac cctggtgcct gccaagtttg agttcattgt ccgcaggaaa
721 ggcttccaca aggtcatgga gggcatccac aaggccatcg agctgggcta caaccctgtg
781 aaggtgaact gtgtggtgat gcgaggcctt aacgaggatg aactcctgga ctttgcggcc
841 ttgactgagg gcctccccct ggatgtgcgc ttcatagagt atatgccctt tgatggcaac
901 aagtggaact tcaagaagat ggtcagctat aaggagatgc tagacactgt ccggcagcag
961 tggccagagc tggagaaggt gccagaggag gaatccagca cagccaaggc ctttaaaatc
1021 cctggcttcc aaggccagat cagcttcatc acatccatgt ctgagcattt ctgtgggacc
1081 tgcaaccgcc tgcgaatcac agctgatggg aacctcaagg tctgcctctt tggaaactct
1141 gaggtatccc tgcgggatca cctgcgagct ggggcctctg agcaggagct gctgagaatc
1201 attggggctg ctgtgggcag gaagaagcgg cagcatgcag gcatgttcag tatttcccag
1261 atgaagaacc ggcccatgat cctcatcggt gggtgaccca tcaagttatt tttgatgttc
1321 cccaattccc caccagccaa tccaagcatt ttctcctggg acccgctcca tgttcagggt
1381 ctaagaccca gaatgagttt ctccagccag gtggccactt tatggaaagg atgcagggtc
1441 ccccagaccc ctcctctagc ccagcagcgg ctggggtctg gctcctttca gagacactac
1501 acttcccgtg cagactcaga tgccaactca aagtgcctta gcccaggttc ctgggcttct
1561 gctgccccct caggacccca gctaacctca gaacaactaa ctcatgtgga ctcggaagga
1621 cgggcagcta tggtagatgt gggcaggaag ccagacacag agcgggtggc tgtggcttca
1681 gccgtggtcc tcctgggacc ggtagccttc aagcttgtcc agcagaacca gctcaagaaa
1741 ggagatgccc tagtggtggc ccagctggct ggagtccagg cagccaaggt gaccagccag
1801 ctgatccctc tgtgccacca cgtggccctg agccacatcc aggtgcagct ggagctggac
1861 agcacacgcc atgccgtgaa gatccaggca tcttgccggg ctcggggccc caccggggtg
1921 gagatggagg ccctgacctc tgctgcagtg gccgccctca ccctgtatga catgtgcaag
1981 gctgtcagca gggacatcgt gttggaggag atcaagctca ttagcaagac tggtggtcag
2041 cggggggact tccatcgggc ttagcacctg cccttctcac ccatggccca cccaggcctg
2101 gagctgggat gcaatgtagg ctgagggaaa gacgtcaggt tcctttaatc acagtcactg
2161 tttgtttacc ttgagcagta aacccgaagt cagcctgctc tactactaac aaacaggcct
2221 gctgctagat gatctctaat gaccaatggg gcttcctttc tatagggagg ataccagcag
2281 gcccttaagc cttccaggac actaaggtcg tgggagcggg actgcaacaa gcaatgccag
2341 ataactgaga aatcatgttc tttgtggact atttcagaca accaggttcc gacagtccag
2401 cccagaactt ttccttctca ttttgggttt tctcttctcc tgctttcctg gggagagatt
2461 aagcgctcat taagcagagg agcccacttt gaggagagca aagcacaagc ttgcctgaag
2521 aatggatccc aacttctccc cggcagctct gcctccctaa gtctgtgaag ccgcagccct
2581 gccctgtcct gtcctgtcct gacttcatct ctccttctgc ccaagtctgt gtcccatcag
2641 acttgcagcc tttcagctta acagttgccc ggtcctgctg gccccttttc ctctggcccc
2701 cctcttctga aacaggatgt gcacacatgg gccatagccc taaggactcc tgccagacca
2761 cacagcccac acctggccct gttcacggct gttccaccca cccctcttta ttctggagca
2821 tatcagggaa agaaaagttg atgatagatt gccttcaccc tcacagcgca caaataaagc
2881 tacgatgcca actttgc



FIGURE 18E: Predicted amino acid sequence of human MOCS1 protein, isoform 1
(SEQ ID NO:20)

1 mwkswklrtd vrvregaggs pcassqpgsr gpcflpglss qevsrrrqfl rehaapfsaf
61 ltdsfgrqhs ylrisltekc nlrcqycmpe egvpltpkan lltteeiltl arlfvkegid
121 kirltggepl irpdvvdiva qlqrleglrt igvttnginl arllpqlqka glsainisld
181 tlvpakfefi vrrkgfhkvm egihkaielg ynpvkvncvv mrglnedell dfaalteglp
241 ldvrfieymp fdgnkwnfkk mvsykemldt vrqqwpelek vpeeesstak afkipgfqgq
301 isfitsmseh fcgtcnrlri tadgnlkvcl fgnsevslrd hlragaseqe llriigaavg
361 rkkrqhagmf sisqmknrpm iligg



FIGURE 18F: Predicted nucleotide sequence of human MOCS1, isoform 2 protein 
(SEQ ID NO:21)

1 gccagaaatc ttcccagtag agatcaccat ccgcccccga cccccaagaa tacttaaggg 
61 gtgggtcctt cccatcaagc tgatttctca acgagaggga caatcccagc ttccccaaca 
121 ttgcagagcc caaacatgtg gaagagttgg aagctccgca cagatgtcag agtaagggag 
181 ggggcaggcg gttctccttg tgcctcttcc cagcccggta gcaggggccc atgcttcctc 
241 cctggtctgt cctcgcagga ggtgtccagg cggaggcagt tcctgcggga gcatgcggcc 
301 cccttctccg ccttcctcac agacagcttc ggccggcagc acagctacct gcggatctcc
361 ctcacagaga agtgcaacct cagatgtcag tactgcatgc ccgaggaggg ggtcccgctg
421 acccccaaag ccaacctgct gaccacagag gagatcctga ccctcgcccg gctctttgtg 
481 aaggaaggca tcgacaagat ccggctcaca ggtggagagc cgcttatccg gccggacgtg 
541 gtggacattg tggcccagct ccagcggctg gaagggctga gaaccatagg tgttaccacc 
601 aatggcatca acctggcccg gctactgccc cagcttcaga aggctggtct cagtgccatc 
661 aacatcagcc tggacaccct ggtgcctgcc aagtttgagt tcattgtccg caggaaaggc 
721 ttccacaagg tcatggaggg catccacaag gccatcgagc tgggctacaa ccctgtgaag 
781 gtgaactgtg tggtgatgcg aggccttaac gaggatgaac tcctggactt tgcggccttg 
841 actgagggcc tccccctgga tgtgcgcttc atagagtata tgccctttga tggcaacaag 
901 tggaacttca agaagatggt cagctataag gagatgctag acactgtccg gcagcagtgg
961 ccagagctgg agaaggtgcc agaggaggaa tccagcacag ccaaggcctt taaaatccct
1021 ggcttccaag gccagatcag cttcatcaca tccatgtctg agcatttctg tgggacctgc 
1081 aaccgcctgc gaatcacagc tgatgggaac ctcaaggtct gcctctttgg aaactctgag 
1141 gtatccctgc gggatcacct gcgagctggg gcctctgagc aggagctgct gagaatcatt 
1201 ggggctgctg tgggcaggaa gaagcggcag catgcaggca tgttcagtat ttcccagatg 
1261 aagaaccggc ccatgatcct catcaagtta tttttgatgt tccccaattc cccaccagcc 
1321 aatccaagca ttttctcctg ggacccgctc catgttcagg gtctaagacc cagaatgagt
1381 ttctccagcc aggtggccac tttatggaaa ggatgcaggg tcccccagac ccctcctcta
1441 gcccagcagc ggctggggtc tggctccttt cagagacact acacttcccg tgcagactca
1501 gatgccaact caaagtgcct tagcccaggt tcctgggctt ctgctgcccc ctcaggaccc
1561 cagctaacct cagaacaact aactcatgtg gactcggaag gacgggcagc tatggtagat
1621 gtgggcagga agccagacac agagcgggtg gctgtggctt cagccgtggt cctcctggga
1681 ccggtagcct tcaagcttgt ccagcagaac cagctcaaga aaggagatgc cctagtggtg
1741 gcccagctgg ctggagtcca ggcagccaag gtgaccagcc agctgatccc tctgtgccac
1801 cacgtggccc tgagccacat ccaggtgcag ctggagctgg acagcacacg ccatgccgtg
1861 aagatccagg catcttgccg ggctcggggc cccaccgggg tggagatgga ggccctgacc
1921 tctgctgcag tggccgccct caccctgtat gacatgtgca aggctgtcag cagggacatc
1981 gtgttggagg agatcaagct cattagcaag actggtggtc agcgggggga cttccatcgg
2041 gcttagcacc tgcccttctc acccatggcc cacccaggcc tggagctggg atgcaatgta
2101 ggctgaggga aagacgtcag gttcctttaa tcacagtcac tgtttgttta ccttgagcag
2161 taaacccgaa gtcagcctgc tctactacta acaaacaggc ctgctgctag atgatctcta
2221 atgaccaatg gggcttcctt tctataggga ggataccagc aggcccttaa gccttccagg
2281 acactaaggt cgtgggagcg ggactgcaac aagcaatgcc agataactga gaaatcatgt
2341 tctttgtgga ctatttcaga caaccaggtt ccgacagtcc agcccagaac ttttccttct
2401 cattttgggt tttctcttct cctgctttcc tggggagaga ttaagcgctc attaagcaga
2461 ggagcccact ttgaggagag caaagcacaa gcttgcctga agaatggatc ccaacttctc
2521 cccggcagct ctgcctccct aagtctgtga agccgcagcc ctgccctgtc ctgtcctgtc
2581 ctgacttcat ctctccttct gcccaagtct gtgtcccatc agacttgcag cctttcagct
2641 taacagttgc ccggtcctgc tggccccttt tcctctggcc cccctcttct gaaacaggat
2701 gtgcacacat gggccatagc cctaaggact cctgccagac cacacagccc acacctggcc
2761 ctgttcacgg ctgttccacg cacccctctt tattctggag catatcaggg aaagaaaagt
2821 tgatgataga ttgccttcac cctcacagcg cacaaataaa gctacgatgc caactttgaa
2881 a



FIGURE 18G: Predicted amino acid sequence of human MOCS1, isoform 2 protein
(SEQ ID NO:22)

1 mwkswklrtd vrvregaggs pcassqpgsr gpcflpglss qevsrrrqfl rehaapfsaf
61 ltdsfgrqhs ylrisltekc nlrcqycmpe egvpltpkan lltteeiltl arlfvkegid
121 kirltggepl irpdvvdiva qlqrleglrt igvttnginl arllpqlqka glsainisld
181 tlvpakfefi vrrkgfhkvm egihkaielg ynpvkvncvv mrglnedell dfaalteglp
241 ldvrfieymp fdgnkwnfkk mvsykemldt vrqqwpelek vpeeesstak afkipgfqgq
301 isfitsmseh fcgtcnrlri tadgnlkvcl fgnsevslrd hlragaseqe llriigaavg
361 rkkrqhagmf sisqmknrpm iliklflmfp nsppanpsif swdplhvqgl rprmsfssqv
421 atlwkgcrvp qtpplaqqrl gsgsfqrhyt sradsdansk clspgswasa apsgpqltse
481 qlthvdsegr aamvdvgrkp dtervavasa vvllgpvafk lvqqnqlkkg dalvvaqlag
541 vqaakvtsql iplchhvals hiqvqlelds trhavkiqas crargptgve mealtsaava
601 altlydmcka vsrdivleei klisktggqr gdfhra



FIGURE 18H: Predicted nucleotide sequence of human MOCS1, isoform 3 protein 
(SEQ ID NO:23) 

1 gccagaaatc ttcccagtag agatcaccat ccgcccccga cccccaagaa tacttaaggg
61 gtgggtcctt cccatcaagc tgatttctca acgagaggga caatcccagc ttccccaaca
121 ttgcagagcc caaacatgtg gaagagttgg aagctccgca cagatgtcag agtaagggag
181 ggggcaggcg gttctccttg tgcctcttcc cagcccggta gcaggggccc atgcttcctc
241 cctggtctgt cctcgcagga ggtgtccagg cggaggcagt tcctgcggga gcatgcggcc
301 cccttctccg ccttcctcac agacagcttc ggccggcagc acagctacct gcggatctcc
361 ctcacagaga agtgcaacct cagatgtcag tactgcatgc ccgaggaggg ggtcccgctg
421 acccccaaag ccaacctgct gaccacagag gagatcctga ccctcgcccg gctctttgtg
481 aaggaaggca tcgacaagat ccggctcaca ggtggagagc cgcttatccg gccggacgtg
541 gtggacattg tggcccagct ccagcggctg gaagggctga gaaccatagg tgttaccacc
601 aatggcatca acctggcccg gctactgccc cagcttcaga aggctggtct cagtgccatc
661 aacatcagcc tggacaccct ggtgcctgcc aagtttgagt tcattgtccg caggaaaggc
721 ttccacaagg tcatggaggg catccacaag gccatcgagc tgggctacaa ccctgtgaag
781 gtgaactgtg tggtgatgcg aggccttaac gaggatgaac tcctggactt tgcggccttg
841 actgagggcc tccccctgga tgtgcgcttc atagagtata tgccctttga tggcaacaag
901 tggaacttca agaagatggt cagctataag gagatgctag acactgtccg gcagcagtgg
961 ccagagctgg agaaggtgcc agaggaggaa tccagcacag ccaaggcctt taaaatccct
1021 ggcttccaag gccagatcag cttcatcaca tccatgtctg agcatttctg tgggacctgc
1081 aaccgcctgc gaatcacagc tgatgggaac ctcaaggtct gcctctttgg aaactctgag
1141 gtatccctgc gggatcacct gcgagctggg gcctctgagc aggagctgct gagaatcatt
1201 ggggctgctg tgggcaggaa gaagcggcag catgcaaagt tatttttgat gttccccaat
1261 tccccaccag ccaatccaag cattttctcc tgggacccgc tccatgttca gggtctaaga
1321 cccagaatga gtttctccag ccaggtggcc actttatgga aaggatgcag ggtcccccag
1381 acccctcctc tagcccagca gcggctgggg tctggctcct ttcagagaca ctacacttcc
1441 cgtgcagact cagatgccaa ctcaaagtgc cttagcccag gttcctgggc ttctgctgcc
1501 ccctcaggac cccagctaac ctcagaacaa ctaactcatg tggactcgga aggacgggca
1561 gctatggtag atgtgggcag gaagccagac acagagcggg tggctgtggc ttcagccgtg
1621 gtcctcctgg gaccggtagc cttcaagctt gtccagcaga accagctcaa gaaaggagat
1681 gccctagtgg tggcccagct ggctggagtc caggcagcca aggtgaccag ccagctgatc
1741 cctctgtgcc accacgtggc cctgagccac atccaggtgc agctggagct ggacagcaca
1801 cgccatgccg tgaagatcca ggcatcttgc cgggctcggg gccccaccgg ggtggagatg
1861 gaggccctga cctctgctgc agtggccgcc ctcaccctgt atgacatgtg caaggctgtc
1921 agcagggaca tcgtgttgga ggagatcaag ctcattagca agactggtgg tcagcggggg
1981 gacttccatc gggcttagca cctgcccttc tcacccatgg cccacccagg cctggagctg
2041 ggatgcaatg taggctgagg gaaagacgtc aggttccttt aatcacagtc actgtttgtt
2101 taccttgagc agtaaacccg aagtcagcct gctctactac taacaaacag gcctgctgct
2161 agatgatctc taatgaccaa tggggcttcc tttctatagg gaggatacca gcaggccctt
2221 aagccttcca ggacactaag gtcgtgggag cgggactgca acaagcaatg ccagataact
2281 gagaaatcat gttctttgtg gactatttca gacaaccagg ttccgacagt ccagcccaga
2341 acttttcctt ctcattttgg gttttctctt ctcctgcttt cctggggaga gattaagcgc
2401 tcattaagca gaggagccca ctttgaggag agcaaagcac aagcttgcct gaagaatgga
2461 tcccaacttc tccccggcag ctctgcctcc ctaagtctgt gaagccgcag ccctgccctg
2521 tcctgtcctg tcctgacttc atctctcctt ctgcccaagt ctgtgtccca tcagacttgc
2581 agcctttcag cttaacagtt gcccggtcct gctggcccct tttcctctgg cccccctctt
2641 ctgaaacagg atgtgcacac atgggccata gccctaagga ctcctgccag accacacagc
2701 ccacacctgg ccctgttcac ggctgttcca cgcacccctc tttattctgg agcatatcag
2761 ggaaagaaaa gttgatgata gattgccttc accctcacag cgcacaaata aagctacgat
2821 gccaactttg aaa



FIGURE 18I: Predicted amino acid sequence of human MOCS1, isoform 3 protein
(SEQ ID NO:24)

1 mwkswklrtd vrvregaggs pcassqpgsr gpcflpglss qevsrrrqfl rehaapfsaf
61 ltdsfgrqhs ylrisltekc nlrcqycmpe egvpltpkan lltteeiltl arlfvkegid
121 kirltggepl irpdvvdiva qlqrleglrt igvttnginl arllpqlqka glsainisld
181 tlvpakfefi vrrkgfhkvm egihkaielg ynpvkvncvv mrglnedell dfaalteglp
241 ldvrfieymp fdgnkwnfkk mvsykemldt vrqqwpelek vpeeesstak afkipgfqgq
301 isfitsmseh fcgtcnrlri tadgnlkvcl fgnsevslrd hlragaseqe llriigaavg
361 rkkrqhaklf lmfpnsppan psifswdplh vqglrprmsf ssqvatlwkg crvpqtppla
421 qqrlgsgsfq rhytsradsd anskclspgs wasaapsgpq ltseqlthvd segraamvdv
481 grkpdterva vasavvllgp vafklvqqnq lkkgdalvva qlagvqaakv tsqliplchh
541 valshiqvql eldstrhavk iqascrargp tgvemealts aavaaltlyd mckavsrdiv
601 leeikliskt ggqrgdfhra
FIGURE 19. CLUSTAL W (1.82) Protein Sequence Alignment Analysis

Mocsl-2 Hs MWKSWKLRTDVRVREGAGGSPCASSQPGSR GPCFLPGLSSQEVSRRRQFLREHAA 

Mocsl-3 Hs MWKSWKLRTDVRVREGAGGSPCASSQPGSR GPCFLPGLSSQEVSRRRQFLREHAA 

Mocsl-1 Hs MWKSWKLRTDVRVREGAGGSPCASSQPGSR GPCFLPGLSSQEVSRRRQFLREHAA 

Mocsl Hs -MAARPLSRMLRRLLRSSARSCSSGAPVTQP C PGES ARAAS EEVSRRRQFLREHAA 

Mocsl-PA Dm MRLLARHAIRLLGQENSAGEVASLSRGAIRLKATTGYLNLATASVQPLEPEKQVLRKNSP 
Mocsl PC Dm MRLLARHAIRLLGQENSAGEVASLSRGAIRLKATTGYLNLATASVQPLEPEKQVLRKNSP 

Mocsl-2 Hs PFSAFLTDSFGRQHSYLRISLTEKCNLRCQYCMPEEGVPLTPKANLLTTEEILTLARLFV 
Mocsl-3 Hs PFSAFLTDSFGRQHSYLRISLTEKCNLRCQYCMPEEGVPLTPKANLLTTEEILTLARLFV 
Mocsl-1 Hs PFSAFLTDSFGRQHSYLRISLTEKCNLRCQYCMPEEGVPLTPKANLLTTEEILTLARLFV 
Mocsl Hs PFSAFLTDSFGRQHSYLRISLTEKCNLRCQYCMPEEGVPLTPKANLLTTEEILTLARLFV 

Mocsl-PA Dm LTDSFGRHHTYLRISLTERCNLRCDYCMPAEGVPLQPKNKLLTTEEILRLARIFV 

Mocsl PC Dm LTDSFGRHHTYLRISLTERCNLRCDYCMPAEGVPLQPKNKLLTTEEILRLARIFV 

Mocsl-2 Hs KEGIDKIRLTGGEPLIRPDWDIVAQLQRLEGLRTIGVTTNGINLARLLPQLQKAGLSAI 
Mocsl-3 Hs KEGIDKIRLTGGEPLIRPDWDIVAQLQRLEGLRTIGVTTNGINLARLLPQLQKAGLSAI 
Mocsl-1 Hs KEGIDKIRLTGGEPLIRPDWDIVAQLQRLEGLRTIGVTTNGINLARLLPQLQKAGLSAI 
Mocs 1 Hs KEGIDKIRLTGGEPLIRPDWDIVAQLQRLEGLRTIGVTTNGINLARLLPQLQKAGLSAI 
Mocsl-PA Dm EQGVRKIRLTGGEPTVRRDIVEIVAQMKALPELEQIGITTNGLVLTRLLLPLQRAGLDNL 
Mocsl PC Dm EQGVRKIRLTGGEPTVRRDIVEIVAQMKALPELEQIGITTNGLVLTRLLLPLQRAGLDNL 

Mocsl-2 Hs NISLDTLVPAKFEFITORKGFHKVMEGIHKAIELGYNPVKVNCVVm 
Mocs 1 - 3 Hs NISLDTLVPAKFEFIVRRKGFHKA/MEGIHKAIELGYNP 

Moc s 1- 1 Hs NI SLDTLVPAKFEFIVRRKGFHKVMEGIHKAIELGYNPVKVWCVVMRGLNEDELLDFAAL 
Mocsl Hs NISLDTLVPAKFEFIVRRKGFHKVMEGIHKAIELGYNPVKVNCVVMRGLNEDELLDFAAL 
Mocsl-PA Dm NISLDTLKRDRFEKITRRKGWERVIAGIDLAVQLGYRP-KVNCVLMRDFNEDEICDFVEF 
Mocsl PC Dm NISLDTLKRDRFEKITRRKGWERVIAGIDLAVQLGYRP-KVNCVLMRDFNEDEICDFVEF 

Mocsl-2 Hs TEGLPLDWFIEYMPFDGNKWNFKKMVSYKEMLDTTOQQWPELEKVPEEESSTAKAFKIP 
Mocsl-3 Hs TEGLPLDVRFIEYMPFDGNKWNFKKMVSYKEML 

MOC S 1 - 1 Hs TEGLPLDVRFIEYMPFDGNKWNFKKMVSYKEMLDTVIIQQWPELEKVPEEESSTAKAF P 
Mocsl Hs TEGHPLDVRFIEYMPFDGJKTKWNFKKMVSYKEMLD 

Mocsl-PA Dm TRNRPVDVRF I EYMPF S GNKWHTERL I S YKDTLQI I RQRWPDFKALPNGPNDTS KAYAVP 
Mocsl PC Dm TRNRPVDVRFIEYMPFSG3STKWHTERLIS YKDTLQI IRQRWPDFKALPNGPNDTSKAYAVP 

Mocsl-2 Hs GFQGQISFITSMSEHFCGTCNRLRITADGNLKVCLFGNSEVSLRDHLR-AGASEQELLRI 
Mocsl-3 Hs GFQGQISFITSMSEHFCGTCNRLRITADG3SFLKVCLFGNSEVSLRDHLR-AGASEQELLRI 
Mocsl-1 Hs GFQGQISFITSMSEHFCGTCNRLRITADGNLKVCLFGNSEVSLRDHLR-AGASEQELLRI 
Mocsl Hs GFQGQISFITSMSEHFCGTCNRLRITADGNLKVCLFGNSEVSLRDHLR-AGASEQELLRI 
Mocsl-PA Dm GFKGQVGFITSMTEHFCGTCNRLRLTADGNIKVCLFGNKEFSLRDAMRDESVSEEQLVDL 
Mocsl PC Dm GFKGQVGFITSMTEHFCGTCNRLRLTADGNIKVCLFGlSrKEFSLRDAMRDESVSEEQLVDL 

Mocsl-2 Hs I GAAVGRKKRQHAGMFS I S QMKNRPMI L I KLFLMF PNS PPANP S I FS WDPLHVQGLRPRM 

Mocsl-3 Hs I GAAVGRKKRQH A KLFLMF PNS PPANP S I F S WDPLHVQGLRPRM 

Mocsl-1 Hs I GAAVGRKKRQH AG MFSISQMKNf RPMI 

Mocsl Hs I GAAVGRKKRQH AG MFSISQMKN RPMI 

Mocsl-PA Dm I GAAVQRKKKQH AG ML 

Mocsl PC Dm I GAAVQRKKKQHAD A APRLHHHLHPYSYHHAYHTSRLQLQAR 

Mocsl-2 Hs SFSSQVATLWKGCRVPQTPPLAQQRLGSGSFQRHYTSRADSDAJSTSKCLSPGSWASAAPSG 
Mocsl-3 Hs SFSSQVATLWKGCRVPQTPPLAQQRLGSGSFQRHYTSRADSDANSKCLSPGSWASAAPSG 

Mocsl-1 Hs LIGG 

Mocsl Hs LIGG 

Mocsl-PA Dm NLS 
Mocsl PC Dm NYS --

Mocsl-2 Hs PQLTSEQLTHVDSEGRAAIWDVGRKPDTERVAVASAWLLGPVAFKLVQQNQLKKGDALV 
Mocsl-3 Hs PQLTSEQLTHVDSEGRAAMVDVGRKPDTERVAVASAWIiLGPVAFKLVQQNQLKKGDALV 

Mocsl-1 Hs 

Mocsl Hs 

Mocsl-PA Dm -QMENR PMILIGG 

Mocsl PC Dm QLTHVDGQGKAQMVDVGAKP STTRL ARAEATVQVGEKLT QL I ADNQVAKGDVLT 

Mocsl-2 HS VAQLAGVQAAKVTSQLI PLCHHVALSHI QVQL ELDS TRHAVK I Q AS CRARGPTGVEMEAL 
Mocsl-3 HS VAQLAGVQAAK^TSQIilPLCHHVALSHIQVQLELDSTRHAVKIQASCRARGPTGVEMEAL 

Mocsl-1 Hs 

Mocsl Hs 

Mocsl-PA Dm 

Mocsl PC Dm VAQIAGIMGAKRTAELI PLCHNISLS SVKVQATLLKTEQSVRLEATVRCSGQTGVEMEAL 

Mocsl-2 HS TSAAVAALTLYDMCKAVSRDIVLEEIKLISKTGGQRGDFHRA — 

Mocsl-3 Hs TSAAVAALTLYDMCKAVSRDIVLEEIKIilSKTGGQRGDFHRA 

Mocsl-1 Hs — 

Mocsl HS 

Mocsl-PA Dm 

Mocsl PC Dm TAVSVAALTVYDMCKAVSHDICITNVRIiLSKSGGKRDFQREEPQNGIVTEVE 
FIGURE 21. Triglyceride content of a peanut (pnut; Gadfly Accession Number CG8705)
mutant

FIGURE 23: HUMAN HOMOLOG OF CG8705 (peanut)

FIGURE 23A. BLASTP search results for peanut (Gadfly Accession Number CG8705)
Homology to human CDC10 protein (gene ref NM_001788; protein ref NP_001779.1)

>ref|NP_001779.1| (NM_001788) cell division cycle 10; cell division cycle
10 (homolog to CDC10 of S. cerevisiae); cell division cycle 10 (homologous
to CDC10 of S. cerevisiae); CDC10 (cell division cycle 10, S. cerevisiae,
homolog); CDC10 protein homolog [Homo sapiens]
sp|Q16181|SEP7_HUMAN SEPTIN 7 (CDC10 PROTEIN HOMOLOG)
pir||JC2352 hCDC10 protein - human
gb|AAB31337.1| (S72008) CDC10 homolog [Homo sapiens]
Length = 418

Score = 548 bits (1398), Expect = e-155 

Identities = 273/419 (65%), Positives = 331/419 (78%), Gaps = 9/419 (2%) 

Query: 113 RQKPMEIAGYVGFANLPNQVYRKAVKR^ 172 

+QK +E GYVGFANL PNQVYRK+ VKRGFEFTLMWG SGLGKSTLINS+FL+D+Y+ E 
Sbjct: 4 QQKNLE--GYVGFANLPNQVYRKSVKRGFEFTLMVVGESGLGKSTLINSLFLTDLYSPE- 60

Query: 173 YPG P S LRKKKTVAVEATKTViyiLKENGVNLTLTVVDT PGFGDA VDNSNCWVP I LEYVD SKYE 232 

YPGPS R KKTV VE +KV++KE GV L LT+VDTPGFGDAVDNSNCW P-f ++Y+DSK+E 
Sbjct: 61 YPGPSHRIKKTVQVEQSKVLIKEGGVQLLLTIVDTPGFGDAVDNSNCWQPVIDYIDSKFE 120

Query: 233 EYLTAESRVTRKTISDSRVHCCLYFIAPSGHGLLPLDIACMQSLSDKVNLVPVIAKM 292 

+YL AESRV R+ + D+RV CCLYFIAPSGHGL PLDI M+ L +KVN+ + P+IAKADT + 
Sbjct: 121 DYLNAESRV]STEIRQMPD3^TRVQCCLYFIAPSGHGLKPLDIEFMKRLHEKVNI I PLIAKADTL 180 

Query: 293 TPDEVHLFKKQILNEIAQHKIKIYDFPATLEDAAEEAKTTQNLRSRVPFAWGANTIIEQ 352 

TP+E FKKQI+ EI +HKIKIY+FP T D EE K + ++ R+P AWG+NTI I E 
Sbjct: 181 TPEECQQFKKQIMKEIQEHKIKIYEFPET — DDEEENKLVKKI KDRL PL AVVGSNT 1 1 EV 23 8 

Query: 353 DGKKVRGRRYPWGLVEVENLTHCDFI^ 412 

+GK+VRGR+YPWG+ EVEN HCDF LRNM I RTH+ QDLKD VTNNVHYENYR RKL+ + 
Sbjct: 239 NGKRWGRQYPWGIAEVENGEHCDFTILRNMKIRT 298 

Query: 413 GLVIDGKAR-LSNKNPLTQMEEEKREHEQKMKKM^ 468 

G+ + K + K+PL QMEEE+REH KMKKME EMEQVF +MKVKEK+ QKL + DS E 

Sbjct: 299 TYNGVDNNKNKGQLTKSPLAQMEEERREHVAKMKKMEMEMEQV 358 

Query: 469 ELARRHEERKKALELQIRELEEKRREFEREKKEWEDVTSIHVTLEELKRRSLGANSSTDNV 527 

EL RRHE+ KK LE Q +ELEEKRR+FE EK WE + ++ R+L N + 

Sbjct: 359 ELQRRHEQMKKNLEAQHKELEEKRRQFEDEKANWEAQQRILEQQNSSRTLEKNKKKGKI 417 



Homology to human Septin 7 (gene ref XM_011595; protein ref XP_011595.4)

>ref | | (XM_011595) similar to SEPTIN 7 (CDC10 PROTEIN HOMOLOG) (H. 
sapiens) [Homo sapiens] , Length = 384 

Score = 498 bits (1268) , Expect = e-139 

Identities = 246/386 (63%), Positives = 302/386 (77%), Gaps = 7/386 (1%) 
Query: 146 MVVGASGLGKSTLINSMFLSDIYNAEQYPGPSLRKKKT 205

MVVG SGLGKSTLINS+FL+D+Y+ E YPGPS R KKTV VE +KV++KE GV L LT+V
Sbjct: 1 MVVGESGLGKSTLINSLFLTDLYSPE-YPGPSHRIKKTVQVEQSKVLIKEGGVQLLLTIV 59

Query: 206 DTPGFGDAVDNSNCWVPILEYVDSKYEEYLTAESRVYRKTISDSRVHCCLYFIAPSGHGL 265 

DTPGFGDAVDNSNCW P+ ++Y4-DSK+E+YL AESRV R+ + D+RV CCLYFIAPSGHGL 
Sbjct: 60 DTPGFGDAVDNSNCWQPVIDYIDSKFEDYLNAESRVNRRQMPDNRVQCCLYFIAPSGHGL 119 

Query: 266 LPLDIACMQSLSDKVBnjVPVIAKADTM 325 

PLDI M+ L +KVN++P+IAKADT+TP+E FKKQI+ EI +HKIKIY+FP T D 
Sbjct: 120 KPLDIEFMKRLHEKA7NIIPLIAKADTLTPEECQQFKKQIMKEIQEHKIKIYEFPET — DD 177 

Query: 326 AEEAKTTQNLRSRVPFAWGANT 1 1 EQDGKKA^GRRYPWGLVEVENLTHCDF I ALRJSTMVI 385 

EE K + ++ R+P AWG+3STTIIE +GK+VRGR+YPWG+ EVEN HCDF LRNM+I 
Sbjct: 178 EEENKLVKKI KDRL PLAWGSNT 1 1 EVTSTGKRVRGRQYPWGVAEVENGEHCDFT I LRNML I 237' 

Query: 3 86 RTHLQDLKDVTNNVHYENYRCRKLSEL GLVDGKAR- LSNKNPLTQMEEEKREHEQKM 441 

RTH+QDLKDVTNNVHYENYR RKL+ + G+ + K + K+PL QMEEE+REH KM 

Sbjct: 238 RTHMQDLKDVTNWH^ 297 

Query: 442 KKMEAEMEQVFDMKVKEKMQKLRDSELELARRHEERKKALE^ 501 

KKME EMEQVF+MKVKEK+QKL+DSE EL RRHE+ KK LE Q +ELEEKRR+FE EK 
Sbjct: 298 KKMEMEMEQVFEMK^ 357 

Query: 502 WEDVNHVTLEELKRRSLGANSSTDNV 527 

WE + ++ R-KD N + 

Sbjct: 358 WEAQQRILEQQNSSRTLEKNKKKGKI 383 



FIGURE 23B: Predicted nucleotide sequence encoding human CDC10 cell division cycle 
10 homolog (SEQ ID NO:25) 

1 agtgcgagat ccgctgctgc tgaggagagg agcgtcaaca gcagcaccat ggtagctcaa 
61 cagaagaacc ttgaaggcta tgtgggattt gccaatctcc caaatcaagt atacagaaaa 
121 tcggtgaaga gaggttttga attcacgctt atggtagtgg gtgaatctgg attgggaaag 
181 tcgacattaa tcaactcatt attcctcaca gatttgtatt ctccagagta tccaggtcct 
241 tctcatagaa ttaaaaagac tgtacaggtg gaacaatcca aagttttaat caaagaaggt 
301 ggtgttcagt tgctgctcac aatagttgat accccaggat ttggagatgc agtggataat
361 agtaattgct ggcagcctgt tatcgactac attgatagta aatttgagga ctacctaaat
421 gcagaatcac gagtgaacag acgtcagatg cctgataaca gggtgcagtg ttgtttatac 
481 ttcattgctc cttcaggaca tggacttaaa ccattggata ttgagtttat gaagcgtttg 
541 catgaaaaag tgaatatcat cccacttatt gccaaagcag acacactcac accagaggaa 
601 tgccaacagt ttaaaaaaca gataatgaaa gaaatccaag aacataaaat taaaatatac 
661 gaatttccag aaacagatga tgaagaagaa aataaacttg ttaaaaagat aaaggaccgt 
721 ttacctcttg ctgtggtagg tagtaatact atcattgaag ttaatggcaa aagggtcaga 
781 ggaaggcagt atccttgggg tattgctgaa gttgaaaatg gtgaacattg tgattttaca 
841 atcctaagaa atatgaagat aagaacacac atgcaggact tgaaagatgt tactaataat 
901 gtccactatg agaactacag aagcagaaaa cttgcagctg tgacttataa tggagttgat 
961 aacaacaaga ataaagggca gctgactaag agccctctgg cacaaatgga agaagaaaga
1021 agggagcatg tagctaaaat gaagaagatg gagatggaga tggagcaggt gtttgagatg 
1081 aaggtcaaag aaaaagttca aaaactgaag gactctgaag ctgagctcca gcggcgccat 
1141 gagcaaatga aaaagaattt ggaagcacag cacaaagaat tggaggaaaa acgtcgtcag 
1201 ttcgaggatg agaaagcaaa ctgggaagct caacaacgta ttttagaaca acagaactct 
1261 tcaagaacct tggaaaagaa caagaagaaa gggaagatct tttaaactct ctattgacca 
1321 ccagttaacg tattagttgc caatatgcca gcttggacat cagtgtttgt tggatccgtt
1381 tgaccaattt gcaccagttt tatccataat gatggattta acagcatgac aaaaattatt
1441 tttttttttg ttcttgatgg agattaagat gccttgaatt gtctagggtg ttctgtactt 
1501 agaaaghaag agctctaagt acctttccta cattttcttt ttttattaaa cagatatctt 
1561 cagtttaatg caagagaaca ttttactgtt gtacaatcat gttctggtgg tttgattgtt 
1621 tacaggatat tccaaaataa aaggactctg gaagattttc attgaggata aattgccata 
1681 atatgatgca aactgtgctt ctctatgata attacaatac aaaggttcca ttcagtgcag 
1741 catatacaat aatgtaattt agtctaacac agttgaccct attttttgac acttccattg 
1801 tttaaaaata cacatggaaa aaaaaaaacc ctatatgctt actgtgcacc tagagctttt 
1861 ttataacaac gtctttttgt ttgtttgttt tggattcttt aaatatatat tattctcatt 
1921 tagtgccctc tttagccaga atctcattac tgcttcattt ttgtaataac atttaattta 
1981 gatattttcc atcatattgg cactgctaaa atagaatata gcatctttca tatggtagga 
2041 accaacaagg aaactttcct ttaactccct ttttacactt tatggtaagt agcagggggg 
2101 gaaatgcatt tatagatcat ttctaggcaa aattgtgaag ctaatgacca acctgtttct 
2161 acctatatgc agtctcttta ttttactaga aatgggaatc atggcctctt gaagagaaaa 
2221 aagtcaccat tctgcattta gctgtattca tatattgcta tttctgtatt ttttgtttgt 
2281 attgtaaaaa attcacataa taaacgatgg ttgtgatgt 



FIGURE 23C: Predicted amino acid sequence of human CDC10 cell division cycle 10 
homolog (SEQ ID NO:26) 

1 mvaqqknleg yvgfanlpnq vyrksvkrgf eftlmvvges glgkstlins lfltdlyspe
61 ypgpshrikk tvqveqskvl ikeggvqlll tivdtpgfgd avdnsncwqp vidyidskfe
121 dylnaesrvn rrqmpdnrvq cclyfiapsg hglkpldief mkrlhekvni ipliakadtl
181 tpeecqqfkk qimkeiqehk ikiyefpetd deeenklvkk ikdrlplavv gsntiievng
241 krvrgrqypw giaevengeh cdftilrnmk irthmqdlkd vtnnvhyeny rsrklaavty
301 ngvdnnknkg qltksplaqm eeerrehvak mkkmememeq vfemkvkekv qklkdseael
361 qrrheqmkkn leaqhkelee krrqfedeka nweaqqrile qqnssrtlek nkkkgkif
FIGURE 24. CLUSTAL W (1.7) Protein Sequence Alignment Analysis

XM_011595 

NM__001788 " 

pnut MlMS PRSNAVNGGSGGAI S ALPSTL AQLALRDKQQAAS AS AS S ATNGS SGSESLVGVGGRP 

XM_011595 

NM_0 01788 MVAQQK- -NLE 

pnut PNQPPSVPVAASGKLDTSSGGASNGDSNKLTHDLQEKEHQQAQKPQKPPLPWQKPMEIA 

XM_011595 MVVGESGLGKSTLINSLFLTDLY-SPEYPGPSHRI 

NMJD0178 8 GYVGFANLPNQVYRKSVKRGFEFTLMWGESGLGKSTLINSLFLTDLY-SPEYPGPSHRX 
pnut GYVGFAm.PNQVYRKAVKRGFEFTLMVVGASGLGKSTLINSMFLSDIYNAEQYPGPSLRK 

**** *********** * - * 



XM__0 11595 KKTVQVEQSKVLIKEGGVQLLLTIVDTPGFGDAVDNSNCWQPVIDYIDSKFEDYLNAESR 
1STM_0 0178 8 KKTVQVEQSKVLIKEGGVQLLLTIVDTPGFGDAVDNSNCWQPVIDYIDSKFEDYLNAESR 

pnut KKWAVEATKVMLKENG^^ 

**** * * .**..****.* **.**************** * ::: * : *** : *-** - **** 

XM_0 11595 VNRRQMPDNRVQCCLYFIAPSGHGLKPLDI EFMKRLHEKVNI I PLIAKADTLTPEECQQF 

NM_00178 8 VNRRQMPDNRVQCCLYFIAPSGHGLKPLDXEFiyiKRLHEKVNI I PLIAKADTLTPEECQQF 

pnut VYRKTISDNRVHCCLYFIAPSGHGLLPLDIACMQSLSDKVNLVPVIAKADTMTPDEVHLF 
* *. m **** z ************* **** *. ★ .***..*.***•***.**.* . * 

XM_01159 5 KKQI3^EIQEHKIKIYEFPETDDE--EENKLVKKIKDRLPLAWGSNTIIEVNGKRVRGR 
NM_001788 KKQIMKEIQEHKIKIYEFPETDDE--EENKLVKKIKDRLPLAWGSNTIIEVNGKRVRGR 
pnut KKQI LNEI AQHKIKXYDF PATLEDAAEEAKTTQNLRSRVPFAWGANTI I EQDGKKVRGR 

****..** .******.** * . . * * * .... *.*.****.***** .**.**** 

XM_01159 5 QYPWGVAEVENGEHCDFT I LRNML I RTHMQDLKDVTJSnSTVHYElSnf RSRKLAAVT YNGVDNN 

NM_0 0178 8 Q YP WG I AEVENGEH CDFT I LRNMK I RTHMQDLKDVTNNVH YENYR S RKL AAVT YN*GVDNN 

pnut RYPWGLVEVENLTHCDFIALRJSIMVIRTHLQDLKDV^^ — VDGK 

.****.**** **** **** ****.**************** # *** : : **.: 

XM_0 11595 KNKGQLTKSPLAQMEEERREHVAKMKKMEM 
NM_0 01788 KNKGQLTKSPLAQMEEERREHVAKMKKMEM 
pnut ARLS — NKNPLTQMEEEKREHEQKMKKMEAEMEQVFD^ 

★ **.*****.*** ****** *** ************** : *** ** **** . 



XM_0 11595 MKKNLEAQHKELEEKRRQFEDEKANWEAQQRILEQQNS SRTLEKN KKKG 

NMJD0178 8 MKKNLEAQHKELEEKRRQFEDEKANWEAQQRILEQQNSSRTLEKN KKKG 

pnut RKKALELQIRELEEKRREFEREKKEWEDWHVTLEELKRRSLGANSSTDIWDGKKEKKKK 
** ** * .*******.** ** .** ... .. m *.* * *** 

XMJD11595 KIF 

NM_001788 KIF 

pnut GLF 
FIGURE 26. Triglyceride content of a pyruvate kinase (Gadfly Accession Number
CG7069) mutant

FIGURE 28: HUMAN HOMOLOG OF CG7069

FIGURE 28A. BLASTP result for CG7069 (Gadfly Accession Number)
Homology to human gene ref XM_037768; ref XP_037768.1 protein

>ref|XP_037768.1| (XM_037768) pyruvate kinase, muscle [Homo sapiens]
gb|AAH00481.1|AAH00481 (BC000481) pyruvate kinase, muscle [Homo sapiens]
gb|AAH07640.1|AAH07640 (BC007640) pyruvate kinase, muscle [Homo sapiens]
gb|AAH07952.1|AAH07952 (BC007952) pyruvate kinase, muscle [Homo sapiens]
Length = 531



Score = 410 bits (1043), Expect = e-113 

Identities = 209/412 (50%), Positives = 284/412 (68%), Gaps = 2/412 (0%) 
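The summary line above can be checked arithmetically. The following is a minimal sketch, not part of the original BLASTP report: the helper name `parse_blast_summary` is hypothetical, and the only input is the summary line as printed in this figure. It extracts the match counts and recomputes each percentage; for this particular line, the integer part of each ratio reproduces the printed values.

```python
import re


def parse_blast_summary(line):
    """Return {label: (matches, aligned_length)} for a BLAST summary line.

    Hypothetical helper for illustration; it only reads the
    'Label = n/total' fields and ignores the printed percentages.
    """
    pattern = r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)"
    return {
        label: (int(n), int(total))
        for label, n, total in re.findall(pattern, line)
    }


# Summary line exactly as printed in FIGURE 28A.
summary = "Identities = 209/412 (50%), Positives = 284/412 (68%), Gaps = 2/412 (0%)"
counts = parse_blast_summary(summary)

for label, (n, total) in counts.items():
    exact = 100 * n / total        # floating-point percentage
    whole = 100 * n // total       # integer part, as printed for this line
    print(f"{label}: {n}/{total} = {exact:.1f}% (integer part {whole}%)")
```

All three ratios share the aligned length of 412 columns, which is the span of the query (residues 1 to 412) including the two gap positions.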



Query: 1 MRVVRMNFSHGSHEYHCQTI^ 60 

M V R+NFSHG+HEYH +TI + R A + L R +A+ALDTKGPEIRTG L G+ 

Sbjct: 69 MNVARLNFSHGTHEYHAETIKNVRTATESFASDPILYRPVAVALDTKGPEIRTG-LIKGS 127

Query: 61 DRAEIELKTGDKVTLSTKKEMADKS^ 120 

AE+ELK G + ++ +K + + + +++DY+ + ++V+ G+ +++VDDGLI+L VK+ 

Sbjct: 128 GTAEVELKKGATLKITLDNAYMEKCDENIL^^ 187

Query: 121 SKGDEVICQVENGGKLGSHKGINLPGVPVDLPSVTEKDKQDLKFGAEQKVDMIFASFIRD 180 

D ++ +VENGG LGS KG+NLPG VDLP+V+EKD QDLKFG EQ VDM+FASFIR
Sbjct: 188 KGADFLVTEVENGGSLGSKKGVNLPGAAVDLPAVSEKDIQDLKFGVEQDVDMVFASFIRK 247

Query: 181 ANALKEIRQVLGPAGACIKIISKIENHQGLVNIDDIIRESDGIMVARGDMGIEIPTEDVP 240

A+ + E+R+VLG G IKIISKIENH+G+ D+I + SDGIMVARGD+GIEIP E V 
Sbjct: 248 ASDVHEVRKVLGEKGKNIKIISKIENHEGVRRFDEILEASDGIMVARGDLGIEIPAEKVF 307

Query: 241 LAQKSIVAKCISTECVGKPVICATQMMESMTNKPRPTRAEASDVANAIFDGCDA^ 300

LAQK ++ +CN+ GKPVICATQM+ESM KPRPTRAE SDVANA+ DG D +MLSGETAK
Sbjct: 308 LAQKMMIGRCNRAGKPVICATQMLESMIKKPRPTRAEGSDVANAVLDGADCIMLSGETAK 367

Query: 301 GKYPVECVQCMARICAKVEAVLWYESLQNSLKREIRTSAADHISAVTTAIAEAATVGQAR 360

G YP+E V+ I + EA +++ L L+R + +D A EA+ + 

Sbjct: 368 GDYPLEAVRMQHLIAREAEAAIYHLQLFEELRR-LAPITSDPTEATAVGAVEASFKCCSG 426

Query: 361 AIWASPCSMVAQWSHMRPPCPIVMLTGNESEAAQSLLFRGIYPLLVEEMV 412

AI+V + A V+ RP PI+ +T N A Q+ L+RGI + P+L ++ V 

Sbjct: 427 AIIVLTKSGRSAHQVARYRPRAPIIAVTRNPQTARQAHLYRGIFPVLCKDPV 478 



FIGURE 28B: Predicted nucleotide sequence encoding human pyruvate kinase, muscle 
(SEQ ID NO:27) 

1 cggcggcccg cagcgggata accttgaggc tgaggcagtg gctccttgca cagcagctgc 
61 acgcgccgtg gctccggatc tcttcgtctt tgcagcgtag cccgagtcgg tcagcagccg 
121 gaggacctca gcagccatgt cgaagcccca tagtgaagcc gggactgcct tcattcagac 
181 ccagcagctg cacgcagcca tggctgacac attcctggag cacatgtgcc gcctggacat 
241 tgattcacca cccatcacag cccggaacac tggcatcatc tgtaccattg gcccagcttc 
301 ccgatcagtg gagacgttga aggagatgat taagtctgga atgaatgtgg ctcgtctgaa 
361 cttctctcat ggaactcatg agtaccatgc ggagaccatc aagaatgtgc gcacagccac
421 ggaaagcttt gcttctgacc ccatcctcta ccggcccgtt gctgtggctc tagacactaa
481 aggacctgag atccgaactg ggctcatcaa gggcagcggc actgcagagg tggagctgaa
541 gaagggagcc actctcaaaa tcacgctgga taacgcctac atggaaaagt gtgacgagaa
601 catcctgtgg ctggactaca agaacatctg caaggtggtg gaagtgggca gcaagatcta
661 cgtggatgat gggcttattt ctctccaggt gaagcagaaa ggtgccgact tcctggtgac
721 ggaggtggaa aatggtggct ccttgggcag caagaagggt gtgaaccttc ctggggctgc
781 tgtggacttg cctgctgtgt cggagaagga catccaggat ctgaagtttg gggtcgagca
841 ggatgttgat atggtgtttg cgtcattcat ccgcaaggca tctgatgtcc atgaagttag
901 gaaggtcctg ggagagaagg gaaagaacat caagattatc agcaaaatcg agaatcatga
961 gggggttcgg aggtttgatg aaatcctgga ggccagtgat gggatcatgg tggctcgtgg
1021 tgatctaggc attgagattc ctgcagagaa ggtcttcctt gctcagaaga tgatgattgg
1081 acggtgcaac cgagctggga agcctgtcat ctgtgctact cagatgctgg agagcatgat
1141 caagaagccc ccgcccactc gggctgaagg cagtgatgtg gccaatgcag tcctggatgg
1201 agccgactgc atcatgctgt ctggagaaac agccaaaggg gactatcctc tggaggctgt
1261 gcgcatgcag cacctgattg cccgtgaggc agaggctgcc atctaccact tgcaattatt
1321 tgaggaactc cgccgcctgg cgcccattac cagcgacccc acagaagcca ccgccgtggg
1381 tgccgtggag gcctccttca agtgctgcag tggggccata atcgtcctca ccaagtctgg
1441 caggtctgct caccaggtgg ccagataccg cccacgtgcc cccatcattg ctgtgacccg
1501 gaatccccag acagctcgtc aggcccacct gtaccgtggc atcttccctg tgctgtgcaa
1561 ggacccagtc caggaggcct gggctgagga cgtggacctc cgggtgaact ttgccatgaa
1621 tgttgggtac gtggctggag caggggctag agcctagagg agcttgggga tgcttgagca
1681 ttggccacca acctcccttc tcttcctcca ggcaaggccc gaggcttctt caagaaggga
1741 gatgtggtca ttgtgctgac cggatggcgc cctggctccg gcttcaccaa caccatgcgt
1801 gttgttcctg tgccgtgatg gaccccagag cccctcctcc agcccctgtc ccaccccctt
1861 cccccagccc atccattagg ccagcaacgc ttgtagaact cactctgggc tgtaacgtgg
1921 cactggtagg ttgggacacc agggaagaag atcaacgcct cactgaaaca tggctgtgtt
1981 tgcagcctgc tctagtggga cagcccagag cctggctgcc ccatcatgtg gccccaccca
2041 atcaagggaa gaaggaggaa tgctggactg gaggcccctg gagccagatg gcaagagggt
2101 gacagcttcc tttcctgtgt gtactctgtc cagttccttt agaaaaaatg gatgcccaga
2161 ggactcccaa ccctggcttg gggtcaagaa acagccagca agagttaggg gtccttaggg
2221 cactgggctg ttgttccatt gaagccgact ctggccctgg cccttacttg cttctctagc
2281 tctctaggcc tctccagttt gcacctgtcc ccaccctcca ctcagctgtc ctgcagcaaa
2341 cactccaccc tccaccttcc atttccccca ctactgcagc acctccaggc ctgttgctat
2401 agagcctacc tgtatgtaat aaa



FIGURE 28C: Predicted amino acid sequence of human pyruvate kinase,
muscle, M1 isozyme (SEQ ID NO:28)

1 mskphseagt afiqtqqlha amadtflehm crldidsppi tarntgiict igpasrsvet
61 lkemiksgmn varlnfshgt heyhaetikn vrtatesfas dpilyrpvav aldtkgpeir
121 tglikgsgta evelkkgatl kitldnayme kcdenilwld yknickvvev gskiyvddgl
181 islqvkqkga dflvteveng gslgskkgvn lpgaavdlpa vsekdiqdlk fgveqdvdmv
241 fasfirkasd vhevrkvlge kgknikiisk ienhegvrrf deileasdgi mvargdlgie
301 ipaekvflaq kmmigrcnra gkpvicatqm lesmikkprp traegsdvan avldgadcim
361 lsgetakgdy pleavrmqhl iareaeaaiy hlqlfeelrr lapitsdpte atavgaveas
421 fkccsgaiiv ltksgrsahq varyrprapi iavtrnpqta rqahlyrgif pvlckdpvqe
481 awaedvdlrv nfamnvgkar gffkkgdvvi vltgwrpgsg ftntmrvvpv p
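The nucleotide listing in FIGURE 28B and the amino acid listing in FIGURE 28C can be cross-checked by translation. The sketch below is an illustration only and is not part of the original disclosure: it assumes the standard genetic code, assumes the open reading frame starts at the first `atg` in row 121 of FIGURE 28B (nucleotide 137), and uses a hypothetical helper `translate`. It translates the first 14 codons.

```python
# Standard genetic code, built from the conventional t/c/a/g ordering.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string in frame 0, stopping at the first stop codon.

    Whitespace (the 10-base block spacing of the listing) is ignored.
    Hypothetical helper for illustration only.
    """
    dna = "".join(dna.lower().split())
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Nucleotides 137-178 of FIGURE 28B (row 121), assumed start of the ORF.
orf_start = "atgt cgaagcccca tagtgaagcc gggactgcct tcattcag"
peptide = translate(orf_start).lower()
print(peptide)
```

The 14 residues obtained this way agree with the first 14 residues listed in FIGURE 28C (`mskphseagt afiq`).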

FIGURE 28D: Predicted amino acid sequence of human pyruvate kinase,
muscle, M2 isozyme (SEQ ID NO:29)

1 mskphseagt afiqtqqlha amadtflehm crldidsppi tarntgiict igpasrsvet
61 lkemiksgmn varlnfshgt heyhaetikn vrtatesfas dpilyrpvav aldtkgpeir
121 tglikgsgta evelkkgatl kitldnayme kcdenilwld yknickvvev gskiyvddgl
181 islqvkqkga dflvteveng gslgskkgvn lpgaavdlpa vsekdiqdlk fgveqdvdmv
241 fasfirkasd vhevrkvlge kgknikiisk ienhegvrrf deileasdgi mvargdlgie
301 ipaekvflaq kmmigrcnra gkpvicatqm lesmikkprp traegsdvan avldgadcim
361 lsgetakgdy pleavrmqhl iareaeaaiy hlqlfeelrr lapitsdpte atavgaveas
421 fkccsgaiiv ltksgrsahq varyrprapi iavtrnpqta rqahlyrgif pvlckdpvqe
481 awaedvdlrv nfamnvgkar gffkkgdvvi vltgwrpgsg ftntmrvvpv p



FIGURE 28E: Predicted nucleotide sequence encoding human pyruvate kinase, liver 
and RBC (PKLR) (SEQ ID NO:30) 

1 gcagccccag gcccacactg aaagcatgtc gatccaggag aacatatcat ccctgcagct 
61 tcggtcatgg gtctctaagt cccaaagaga cttagcaaag tccatcctga ttggggctcc 
121 aggagggcca gcggggtatc tgcggcgggc cagtgtggcc caactgaccc aggagctggg 
181 cactgccttc ttccagcagc agcagctgcc agctgctatg gcagacacct tcctggaaca 
241 cctctgccta ctggacattg actccgagcc cgtggctgct cgcagtacca gcatcattgc 
301 caccatcggg ccagcatctc gctccgtgga gcgcctcaag gagatgatca aggccgggat
361 gaacattgcg cgactcaact tctcccacgg ctcccacgag taccatgctg agtccatcgc
421 caacgtccgg gaggcggtgg agagctttgc aggttcccca ctcagctacc ggcccgtggc 
481 catcgccctg gacaccaagg gaccggagat ccgcactggg atcctgcagg ggggtccaga 
541 gtcggaagtg gagctggtga agggctccca ggtgctggtg actgtggacc ccgcgttccg 
601 gacgcggggg aacgcgaaca ccgtgtgggt ggactacccc aatattgtcc gggtcgtgcc 
661 ggtggggggc cgcatctaca ttgacgacgg gctcatctcc ctagtggtcc agaaaatcgg 
721 cccagaggga ctggtgaccc aagtggagaa cggcggcgtc ctgggcagcc ggaagggcgt 
781 gaacttgcca ggggcccagg tggacttgcc cgggctgtcc gagcaggacg tccgagacct 
841 gcgcttcggg gtggagcatg gggtggacat cgtctttgcc tcctttgtgc ggaaagccag 
901 cgacgtggct gccgtcaggg ctgctctggg tccggaagga cacggcatca agatcatcag 
961 caaaattgag aaccacgaag gcgtgaagag gtttgatgaa atcctggagg tgagcgacgg 
1021 catcatggtg gcacgggggg acctaggcat cgagatccca gcagagaagg ttttcctggc 
1081 tcagaagatg atgattgggc gctgcaactt ggcgggcaag cctgttgtct gtgccacaca 
1141 gatgctggag agcatgatta ccaagccccg gccaacgagg gcagagacaa gcgatgtcgc 
1201 caatgctgtg ctggatgggg ctgactgcat catgctgtca ggggagactg ccaagggcaa 
1261 cttccctgtg gaagcggtga agatgcagca tgcgattgcc cgggaggcag aggccgcagt 
1321 gtaccaccgg cagctgtttg aggagctacg tcgggcagcg ccactaagcc gtgatcccac 
1381 tgaggtcacc gccattggtg ctgtggaggc tgccttcaag tgctgtgctg ctgccatcat 
1441 tgtgctgacc acaactggcc gctcagccca gcttctgtct cggtaccgac ctcgggcagc 
1501 agtcattgct gtcacccgct ctgcccaggc tgcccgccag gtccacttat gccgaggagt 
1561 cttccccttg ctttaccgtg aacctccaga agccatctgg gcagatgatg tagatcgccg 
1621 ggtgcaattt ggcattgaaa gtggaaagct ccgtggcttc ctccgtgttg gagacctggt 
1681 gattgtggtg acaggctggc gacctggctc cggctacacc aacatcatga gggtgctaag 
1741 catatcctga gacgcccctc ccccctctgg cccagcctac ccttgtaccc catcccttcc 
1801 tccccagtct acgttctcca gcccacaccc ctccaaagcc ccacctttaa gtcctctctt 
1861 ctctattcct gaccctccct acctgaggcc tatctgagac tataactgtc atctagcccc 
1921 ttcgaggttg ccccttcccc atctccattt cacacaggtc ctgaaagtct gtgtccaatt 
1981 atgcactggc cacccaacag caccaattgt acattctctg catccaatct gctcagcagg 
2041 ccctaagatg ccttgagtct ttaatcccaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 
2101 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 
2161 aaaaaaaaa 



FIGURE 28F: Predicted amino acid sequence of human PKLR (SEQ ID NO:31)

1 msiqenissl qlrswvsksq rdlaksilig apggpagylr rasvaqltqe lgtaffqqqq
61 lpaamadtfl ehlclldids epvaarstsi iatigpasrs verlkemika gmniarlnfs
121 hgsheyhaes ianvreaves fagsplsyrp vaialdtkgp eirtgilqgg pesevelvkg
181 sqvlvtvdpa frtrgnantv wvdypnivrv vpvggriyid dglislvvqk igpeglvtqv
241 enggvlgsrk gvnlpgaqvd lpglseqdvr dlrfgvehgv divfasfvrk asdvaavraa
301 lgpeghgiki iskienhegv krfdeilevs dgimvargdl gieipaekvf laqkmmigrc
361 nlagkpvvca tqmlesmitk prptraetsd vanavldgad cimlsgetak gnfpveavkm
421 qhaiareaea avyhrqlfee lrraaplsrd ptevtaigav eaafkccaaa iivltttgrs
481 aqllsryrpr aaviavtrsa qaarqvhlcr gvfpllyrep peaiwaddvd rrvqfgiesg
541 klrgflrvgd lvivvtgwrp gsgytnimrv lsis
FIGURE 29. CLUSTAL W (1.7) Protein Sequence Alignment Analysis



pk3_h2 MSKPHSKAGTAFIQTQQLHAAMADTFLEHMCRLDIDSPPITARNTGIICTIGPASRSVET 

pk3_h MSKPHSEAGTAFIQTQQLHAAMADTFLEHMCRLDIDSPPITARNTGIICTIGPASRSVET

pk3_m MPKPHSEAGTAFIQTQQLHAAMADTFLEHMCRLDIDSAPITARNTGIICTIGPASRSVEM

pk3_dro 

pk3_h2 LKEMIKSGMNVARLNFSHGTHEYHAETIKNVRTATESFASDPILYRPVAVALDTKGPEIR

pk3_h LKEMIKSGMNVARLNFSHGTHEYHAETIKNVRTATESFASDPILYRPVAVALDTKGPEIR

pk3_m LKEMIKSGMNVARLNFSHGTHEYHAETIKNVREATESFASDPILYRPVAVALDTKGPEIR

pk3_dro MRVVRMNFSHGSHEYHCQTIQAARKAIAMYV^QTGLPRTLAIALDTKGPEIR 

* * * : *****.**** ^ .** . m * * * * . : *: ********** 



pk3_h2 TGLIKGSG-TAEVELKKGATLKITLDNAYMEKCDENILWLDYKNICKVVEVGSKIYVDDG

pk3_h TGLIKGSG-TAEVELKKGATLKITLDNAYMEKCDENILWLDYKNICKVVEVGSKIYVDDG

pk3_m TGLIKGSG-TAEVELKKGATLKITLDNAYMEKCDENILWLDYKNICKVVEVGSKIYVDDG

pk3_dro TGKLAGGKTORAEIELKTGDKVTLSTKKEMADKSNKDNIYVDYQRLPQLVKPGNRVFVDDG 

* * . * **.*** * . . . * ... •••**• . . . * . * ...**** 

. .. . . ..... .. . .... ... ... .. . .... 

pk3_h2 LISLQVKQKGADFLVTEVENGGSLGSKKGVNLPGAAVDLPAVSEKDIQDLKFGVEQDVDM

pk3_h LISLQVKQKGADFLVTEVENGGSLGSKKGVNLPGAAVDLPAVSEKDIQDLKFGVEQDVDM

pk3_m LISLQVKEKGADFLVTEVENGGSLGSKKGVNLPGAAVDLPAVSEKDIQDLKFGVEQDVDM

pk3_dro LIALIVKESKGDEVICQVENGGKLGSHKGINLPGVPVDLPSVTEKDKQDLKFGAEQKVDM 

**.* **. * .. .***** ***.**.**** ****.*.*** ****** ** *** 
. ...... . .. .. .. .. 

pk3_h2 VFASFIRKASDVHEVRKVLGEKGKNIKIISKIE^

pk3_h VFASFIRKASDVHEVRKVLGEKGKNIKIISKIENHEGVRRFDEILEASDGIMVARGDLGI

pk3_m VFASFIRKAADVHEVRKVLGEKGKNIKIISKIENHEGVRRFDEILEASDGIMVARGDLGI

pk3_dro IFASFIRDANALKEIRQVLGPAGACIKIISKIENHQGLVNIDDIIRESDGIMVARGDMGI
.****** * ..*.*.*** * **********.*. . * . * . **********.** 



pk3_h2 EIPAEKVFLAQKMMIGRCNRAGKPVICATQMLESMIKKPRPTRAEGSDVANAVLDGADCI 

pk3_h EIPAEKVFLAQKMMIGRCNRAGKPVICATQMLESMIKKPRPTRAEGSDVANAVLDGADCI

pk3_m EIPAEKVFLAQKMMIGRCNRAGKPVICATQMLESMIKKPRPTRAEGSDVANAVLDGADCI

pk3_dro EIPTEDVPLAQKSIVAKCNKVGKPVICATQMMESMTNKPRPTRAEASDVANAIFDGCDAV

***.** **** ...**.**********.*** .**************..***. 

pk3_h2 MLSGETAKGDYPLEAVRMQNLIAREAEAAIYHLQLFEELRR-LAPITSDPTEATAVGAVE 
pk3_h MLSGETAKGDYPLEAVRMQHLIAREAEAAIYHLQLFEELRR-LAPITSDPTEATAVGAVE
pk3_m MLSGETAKGDYPLEAVRMQHLIAREAEAAIYHLQLFEELRR-LAPITSDPTEAAAVGAVE

pk3_dro MLSGETAKGKYPVECVQCMARICAKVEAVLWYESLQNSLKREIRTSAADHISAVTTAIAE
***********.**. * ^ :„**.::: . * : . * : * : . : : * . * 



pk3_h2 ASFKCCSGAIIVLTKSGRSAHQVARYRPRAPIIAVTRNPQTARQAHLYRGIFPVLCKDPV

pk3_h ASFKCCSGAIIVLTKSGRSAHQVARYRPRAPIIAVTRNPQTARQAHLYRGIFPVLCKDPV

pk3_m ASFKCCSGAIIVLTKSGRSAHQVARYRPRAPIIAVTRNPQTARQAHLYRGIFPVLCKDAV

pk3_dro AATVGQARAIWASPCSMVAQMVSHMRPPCPIVMLTGNESEAAQSLLFRGIYPLLVEEMV 
*. . **.* . *. *.. ** ^ ** . . * * ^ * *. *.***.*.* * 

pk3_h2 QEAWAEDVDLRVNFAMNVGKARGFFK--KGDVVIVLTGWRPGSGF-TNTMRVVPVP

pk3_h QEAWAEDVDLRVNFAMNVGKARGFFK--KGDVVIVLTGWRPGSGF-TNTMRVVPVP

pk3_m LNAWAEDVDLRVNLAMDVGKARGFFK--KGDVVIVLTGWRPGSGF-TNTMRVVPVP

pk3_dro IGSFNFRRIMQSGLKL-MGKMDILEPGQKGSWLVNAMSAEKITFRLFTIRQQTKEERDQ 
:: ::.:::** : ** - **.*. * *.* 
pk3_h2

pk3_h

pk3_m

pk3_dro DERCRKLALEQSCKERAEKEECRKLQQAEECQKQKLAKKCKQFEEKQKVCPKKiaDTPKND

pk3_h2

pk3_h

pk3_m

pk3_dro CPKKDCPKKECPKQDDEISKCRQMQEAEAEERKCKEEFEQMCKLAEEKRKEAEKCRKADE

pk3_h2

pk3_h

pk3_m

pk3_dro ERRJCEEAEKCRKLEEDRKCKLAEEKKRNEEELKIIEAEVAKLEAAEKAKRLKEEEKKKEE

pk3_h2

pk3_h

pk3_m

pk3_dro LMKCKQRNEAKKKREEAERCKRKERERELAEMElSnECWKQVAEKRKRKKAAEMCRKIEDAKE

pk3_h2

pk3_h 

pk3_m 

pk3_dro KAAAESADKILKAVCEKLKQSLSDPDKSKKGKK 
FIGURE 32: HUMAN HOMOLOG OF CG9429 (Calreticulin, crc)



FIGURE 32A. BLASTP search result for crc (Gadfly Accession Number CG9429) 

ref|NP_004334.1| (NM_004343) calreticulin precursor; Sicca syndrome antigen A
(autoantigen Ro; calreticulin); autoantigen Ro [Homo sapiens]
Length = 417



Score = 575 bits (1467), Expect = e-163 

Identities = 269/404 (66%), Positives = 317/404 (77%), Gaps = 5/404 (1%) 



Query: 6 WIVLLATVGFISAE--VYLKENF-DNENWEDTWIYSKHPGKEFGKFVLTPGTFYNDAEA 62

+V +LL +G AE VY KE F D + W WI SKH +FGKFVL+ G FY D E 
Sbjct: 4 SVPLLLGLLGLAVAEPAVYFKEQFLDGDGWTSRWIESKHKS-DFGKFVLSSGKFYGDEEK 62 

Query: 63 DKGIQTSQDARFYAASRKFDGFSNEDKPLWQFSVKHEQNIDCGGGYVKLFDCSLDQTDM 122 

DKG+QTSQDARFYA S F+ FSN-f + LWQF +VKHEQNIDCGGGYVKLF SLDQTDM 
Sbjct: 63 DKGLQTSQDARFYALSASFEPFSNKGQTLVVQFTVKHEQNIDCGGGYVKLFPNSLDQTDM 122

Query: 123 HGESPYEIMFGPDICGPGTKKVHVIFSYKGKNHLISKDIRCKDDVYTHFYTLIVRPDNTY 182

HG+S Y IMFGPDICGPGTKKVHVIF+YKGKN LX+KDXRCKDD +TH YTL IVRPDNT Y 
Sbjct: 123 HGDSEYNIMFGPDICGPGTKK^THVIFNYKGKNVL^ 182 

Query: 183 EVLIDNEKVESGNLEDDWDFLAPKKIKDPTATKPEDWDDRATIPDPDDKKPEDWDKPEHI 242

EV IDN +VESG+LEDDWDFL PKKIKDP A+KPEDWD+RA I DP D KPEDWDKPEHI
Sbjct: 183 EVKIDNSQVESGSLEDDWDFLPPKKIKDPDASKPEDWDERAKIDDPTDSKPEDWDKPEHI 242

Query: 243 PDPDATKPEDWDDEMDGEWEPPMIDNPEFKGEWQPKQLDNPNYKGAWEHPEIANPEYVPD 302

PDPDA KPEDWD+EMDGEWEPP+I NPE+KGEW+ P+Q+DNP+ YKG W HPEI NPEY PD
Sbjct: 243 PDPDAKKPEDWDEEMDGEWEPPVIQNPEYKGEWKPRQIDNPDYKGTWIHPEIDNPEYSPD 302

Query: 303 DKLYLRKEICTLGFDLWQVKSGTIFDNVLITDDVELAAKAAAEVKN-TQAGEKKMKEAQD 361

+Y LG DIj WQVKSGT I FDN LIT+D A + E T+A EK+MK+ QD 

Sbjct: 303 PSIYAYDNFGVLGLDLWQWSGTIFDNFLITNDEAYAEEFGK^ 362

Query: 362 EVQRKKDEEEAKKASDKDDEDEDDDDEEKDDESKQDKDQSEHDE 405

E QR K+EEE KK ++++ ++ +DDE+KD++ + ++D+ E +E
Sbjct: 363 EEQRLKEEEEDKKRKEEEEAEDKEDDEDKDEDEEDEEDKEEDEE 406 
FIGURE 32B: Predicted nucleotide sequence encoding human Calreticulin
(SEQ ID NO:32) 

1 gtccgtactg cagagccgct gccggagggt cgttttaaag ggccgcgttg ccgccccctc 
61 ggcccgccat gctgctatcc gtgccgctgc tgctcggcct cctcggcctg gccgtcgccg 
121 agcccgccgt ctacttcaag gagcagtttc tggacggaga cgggtggact tcccgctgga 
181 tcgaatccaa acacaagtca gattttggca aattcgttct cagttccggc aagttctacg 
241 gtgacgagga gaaagataaa ggtttgcaga caagccagga tgcacgcttt tatgctctgt 
301 cggccagttt cgagcctttc agcaacaaag gccagacgct ggtggtgcag ttcacggtga
361 aacatgagca gaacatcgac tgtgggggcg gctatgtgaa gctgtttcct aatagtttgg
421 accagacaga catgcacgga gactcagaat acaacatcat gtttggtccc gacatctgtg 
481 gccctggcac caagaaggtt catgtcatct tcaactacaa gggcaagaac gtgctgatca 
541 acaaggacat ccgttgcaag gatgatgagt ttacacacct gtacacactg attgtgcggc 
601 cagacaacac ctatgaggtg aagattgaca acagccaggt ggagtccggc tccttggaag 
661 acgattggga cttcctgcca cccaagaaga taaaggatcc tgatgcttca aaaccggaag 
721 actgggatga gcgggccaag atcgatgatc ccacagactc caagcctgag gactgggaca 
781 agcccgagca tatccctgac cctgatgcta agaagcccga ggactgggat gaagagatgg 
841 acggagagtg ggaaccccca gtgattcaga accctgagta caagggtgag tggaagcccc 
901 ggcagatcga caacccagat tacaagggca cttggatcca cccagaaatt gacaaccccg
961 agtattctcc cgatcccagt atctatgcct atgataactt tggcgtgctg ggcctggacc 
1021 tctggcaggt caagtctggc accatctttg acaacttcct catcaccaac gatgaggcat 
1081 acgctgagga gtttggcaac gagacgtggg gcgtaacaaa ggcagcagag aaacaaatga 
1141 aggacaaaca ggacgaggag cagaggctta aggaggagga agaagacaag aaacgcaaag 
1201 aggaggagga ggcagaggac aaggaggatg atgaggacaa agatgaggat gaggaggatg 
1261 aggaggacaa ggaggaagat gaggaggaag atgtccccgg ccaggccaag gacgagctgt 
1321 agagaggcct gcctccaggg ctggactgag gcctgagcgc tcctgccgca gagcttgccg 
1381 cgccaaataa tgtctctgtg agactcgaga actttcattt ttttccaggc tggttcggat
1441 ttggggtgga ttttggtttt gttcccctcc tccactctcc cccaccccct ccccgccctt 
1501 tttttttttt tttttaaact ggtattttat cctttgattc tccttcagcc ctcacccctg 
1561 gttctcatct ttcttgatca acatcttttc ttgcctctgt gccccttctc tcatctctta 
1621 gctcccctcc aacctggggg gcagtggtgt ggagaagcca caggcctgag atttcatctg 
1681 ctctccttcc tggagcccag aggagggcag cagaaggggg tggtgtctcc aaccccccag 
1741 cactgaggaa gaacggggct cttctcattt cacccctccc tttctcccct gcccccagga 
1801 ctgggccact tctgggtggg gcagtgggtc ccagattggc tcacactgag aatgtaagaa 
1861 ctacaaacaa aatttctatt aaattaaatt ttgtgtctc 



FIGURE 32C: Predicted amino acid sequence of human Calreticulin (SEQ ID NO:33) 

1 mllsvplllg llglavaepa vyfkeqfldg dgwtsrwies khksdfgkfv lssgkfygde

61 ekdkglqtsq darfyalsas fepfsnkgqt lvvqftvkhe qnidcgggyv klfpnsldqt

121 dmhgdseyni mfgpdicgpg tkkvhvifny kgknvlinkd irckddefth lytlivrpdn

181 tyevkidnsq vesgsleddw dflppkkikd pdaskpedwd erakiddptd skpedwdkpe

241 hipdpdakkp edwdeemdge weppviqnpe ykgewkprqi dnpdykgtwi hpeidnpeys

301 pdpsiyaydn fgvlgldlwq vksgtifdnf litndeayae efgnetwgvt kaaekqmkdk

361 qdeeqrlkee eedkkrkeee eaedkedded kdedeedeed keedeeedvp gqakdel
FIGURE 32D: Predicted nucleotide sequence encoding human Calreticulin 2
(SEQ ID NO:34) 

1 agcggagagg cgcagagaga gctgggagct aaggggtggc ggcgaccgga agcgcagtgc 
61 acacccccat ggcccgggct ttggtccagt tctgggccat atgcatgctg cgagtggcgc 
121 tggctaccgt ctatttccaa gaggaatttc tagacggaga gcattggaga aaccgatggt 
181 tgcagtccac caatgactcc cgatttgggc attttagact ttcgtcgggc aagttttatg 
241 gtcataaaga gaaagataaa ggtctgcaaa ccactcagaa tggccgattc tatgccatct 
301 ctgcacgctt caaaccgttc agcaataaag ggaaaactct ggttattcag tacacagtaa
361 aacatgagca gaagatggac tgtggagggg gctacattaa ggtctttcct gcagacattg
421 accagaagaa cctgaatgga aaatcgcaat actatattat gtttggaccc gatatttgtg 
481 gatttgatat caagaaagtt catgttattt tacatttcaa gaataagtat cacgaaaaca 
541 agaaactgat caggtgtaag gttgatggct tcacacacct gtacactcta attttaagac 
601 cagatctttc ttatgatgtg aaaattgatg gtcagtcaat tgaatccggc agcatagagt
661 acgactggaa cttaacatca ctcaagaagg aaacgtcccc ggcagaatcg aaggattggg 
721 aacagactaa agacaacaaa gcccaggact gggagaagca ttttctggac gccagcacca 
781 gcaagcagag cgactggaac ggtgacctgg atggggactg gccagcgccg atgctccaga 
841 agcccccgta ccaggatggc ctgaaaccag aaggtattca taaagacgtc tggctccacc 
901 gtaagatgaa gaataccgac tatttgacgc agtatgacct ctcagaattt gagaacattg
961 gtgccattgg cctggagctt tggcaggtga gatctggaac catttttgat aactttctga
1021 tcacagatga tgaagagtat gcagataatt ttggcaaggc cacctggggc gaaaccaagg 
1081 gtccagaaag ggagatggat gccatacagg ccaaggagga aatgaagaag gcccgcgagg 
1141 aagaggagga agagctgctg tcgggaaaaa ttaacaggca cgaacattac ttcaatcaat 
1201 ttcacagaag gaatgaactt tagtgatccc cattggatat aaggatgact ggtaaaatct 
1261 cattgctact ttaatctaaa aaaaaaaaaa aaa



FIGURE 32E: Predicted amino acid sequence of human Calreticulin 2 (SEQ ID NO:35) 

1 maralvqfwa icmlrvalat vyfqeefldg ehwrnrwlqs tndsrfghfr lssgkfyghk

61 ekdkglqttq ngrfyaisar fkpfsnkgkt lviqytvkhe qkmdcgggyi kvfpadidqk 

121 nlngksqyyi mfgpdicgfd ikkvhvilhf knkyhenkkl irckvdgfth lytlilrpdl 

181 sydvkidgqs iesgsieydw nltslkkets paeskdweqt kdnkaqdwek hfldastskq 

241 sdwngdldgd wpapmlqkpp yqdglkpegi hkdvwlhrkm kntdyltqyd lsefenigai

301 glelwqvrsg tifdnflitd deeyadnfgk atwgetkgpe remdaiqake emkkareeee
361 eellsgkinr hehyfnqfhr rnel
FIGURE 33. CLUSTAL W (1.82) Protein Sequence Alignment Analysis

crc Dm MMWCKTVIVLLAW 

crc Hs MLLSVPLLLGLLGLAVAEPAVYFKEQFLDGDGWTSRWIESKHKS-DFGKFVLSSGKFYGD

MGC26577 Hs MARAWQFWAICMLRVALATW 

crc Dm AEADKGIQTSQDARFYAASRKFDGFSNEDKPLVVQFSVKHEQNIDCGGGYVKLFDCSLDQ

crc Hs EEKDKGLQTSQDARFYALSASFEPFSNKGQTLVVQFTVKHEQNIDCGGGYVKLFPNSLDQ

MGC26577 Hs KEKDKGLQTTQNGRFYAISARFKPFSNKGKTLVIQYTVKHEQKMDCGGGYIKVFPADIDQ

crc Dm TDMHGESPYEIMFGPDICGPGTKKVHVIFSYKGKNHLISKDIRCKDDVYTHFYTLIVRPD

crc Hs TDMHGDSEYNIMFGPDICGPGTKKVHVIFISnrKGKNV^ 

MGC26577 Hs KNLNGKSQYYIMFGPDICGFDIKKVH^ 

crc Dm NTYEVLIDNEKVESGNLEDDWDFLAPKKIKDPTATKPEDWDDRATIPDPDDKKPEDWDKP

crc Hs NTYEVKIDNSQVESGSLEDDWDFLPPKKIKDPDASKPEDWDERAKIDDPTDSKPEDWDKP

MGC26577 Hs LSYDVKIDGQSIESGSIEYDWNLTSLKKETSPAESK--DWEQTK DNKAQDWEK-

crc Dm EHIPDPDATKPEDWDDEMDGEWEPPMIDNPEFKGEWQPKQLDNPNYKGAWEHPEIANPEY

crc Hs EHIPDPDAKKPEDWDEEMDGEWEPPVIQNPEYKGEWKPRQIDNPDYKGTWIHPEIDNPEY

MGC26577 Hs -HFLDASTSKQSDWNGDLDGDWPAPMLQKPPYQDGLKPEGIH KDVWLHRKMKNTDY

crc Dm VPDDKLYLRKEICTLGFDLWQVKSGTIFDNVLITO^ 

crc Hs SPDPSIYAYDNFGVLGLDLWQVKSGTIFDNFLITNDEAYAEEFGNETWGVTKAAEKQMKD

MGC26577 Hs LTQYDLSEFENIGAIGLELWQVRSGTIFDNFLITDDEEYADNFGKATWGETKGPEREMDA 

crc Dm AQDEVQRKKDEEEAKKASDKDDED--EDDDDEEKDDESKQDKDQSE HDEL

crc Hs KQDEEQRLKEEEEDKKRKEEEEAEDKEDDEDKDEDEEDEEDKEEDEEEDVPGQAKDEL

MGC26577 Hs IQAK EEMKKAREEEEEELLSGKINRHEHYFNQFHR RNEL 



